åäö domains for dummies
June 1, 2011 | Posted in Internet standards | By Cristian Herrera
I have seen that there is quite a lot of confusion out there about so-called åäö domains, or IDN / IDNA domains, on the Internet. I had a bit of a discussion about this in January last year (2010), and since then people seem to have become wiser about what this really is.
So, how do I find out what it is?
I can honestly say that I have immersed myself in this topic more than I need to. Still, I have dug into it enough to be able to give accurate advice.
If you want to know how something on the Internet and its standards works, there is no better source than to dig into the good old RFCs (Requests for Comments) used by the IETF (the Internet Engineering Task Force) in developing the Internet.
The RFCs that currently define IDN are RFC 4690 and RFC 4290; for IDNA they are RFC 5890 and RFC 5891.
First, let's sort out some concepts.
To make what follows easier to understand, we must first clarify the concepts.
Binary code / hex code
All data processed by a computer is in binary code. This is really nothing more than integers written in radix 2 instead of radix 10 (which we are accustomed to).
The binary system is used because data was originally transmitted as electrical impulses, 1 for high voltage and 0 for low. Bits are often grouped in units of 7, 8, 16, 32 and 64 bits, where each bit represents a one or a zero.
Radix 8 (octal) and radix 16 (hexadecimal) are therefore also used frequently, since it is very easy to convert binary to these radixes, e.g.
8 bits that are all ones would be
binary 1111 1111 or 0b11111111
hex FF or 0xFF
or 255 in decimal.
Thus 4 binary bits correspond to 1 hex digit, which is why hex is often used for readability: a byte, or 8 bits, equals 2 hex digits. Just look at HTML color codes, which are usually written in hex RGB (red, green, blue).
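For those who want to try the conversions themselves, here is a small sketch in Python. The helper name parse_html_color and the color value used in the usage note are my own illustrations, not part of any standard.

```python
# The same value written in radix 2, 16 and 10.
value = 0b11111111                    # eight bits, all ones

binary_text = format(value, "08b")    # radix 2, padded to 8 digits
hex_text = format(value, "02X")       # radix 16, padded to 2 digits

# Each hex digit stands for exactly four bits.
nibbles = [binary_text[i:i + 4] for i in range(0, 8, 4)]
hex_digits = [format(int(nibble, 2), "X") for nibble in nibbles]


def parse_html_color(code):
    """Split a '#RRGGBB' color code into (red, green, blue) integers."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))
```

Here binary_text is '11111111', hex_text is 'FF', and parse_html_color("#FF8000") gives (255, 128, 0): three bytes, each written as two hex digits.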
ASCII
American Standard Code for Information Interchange – A code page that maps ordinary integers to characters such as “a”, “b”, “c” and “1”, “2”, “3”. This code page is based on a 7-bit integer and can therefore only take 128 values.
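A minimal sketch of the idea in Python, where ord() gives the integer behind a character. The helper names ascii_codes and is_ascii are my own, chosen only for illustration.

```python
def ascii_codes(text):
    """Return the integer code of every character in text."""
    return [ord(ch) for ch in text]


def is_ascii(text):
    """True if every character fits in 7 bits (values 0 to 127)."""
    return all(ord(ch) < 128 for ch in text)
```

ascii_codes("abc") gives [97, 98, 99], and is_ascii("domän") is False because ä lies outside the 128 values ASCII can represent.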
Unicode
Unicode is the standard developed to unify all languages and character encodings in a single system. The system is much more complex, and it would take a book to explain how it works, but in short it can be encoded as UTF-8, UTF-16 and UTF-32, where the number indicates the number of bits in each code unit. Anything wider than 8 bits must be split into smaller “packages”, since data communication in principle takes place in 8-bit sequences.
UTF-8
UTF-8 is the standard now slowly replacing ASCII. In UTF-8, all ASCII characters keep the same “position” as before. UTF-8 is variable-length, and a single character can span up to 4 bytes. UTF-8 is now the standard on UNIX-like systems such as Mac and Linux, and the recommended standard for the Internet. Windows, however, still uses UTF-16 internally.
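The variable length is easy to see in Python. This is only a sketch; the helper name utf8_length and the sample characters are my own choices.

```python
def utf8_length(ch):
    """Number of bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# Sample characters of increasing "size" in UTF-8.
samples = {
    "a": utf8_length("a"),                    # plain ASCII letter
    "ä": utf8_length("ä"),                    # Swedish letter
    "€": utf8_length("€"),                    # euro sign
    "emoji": utf8_length("\U0001F600"),       # a character outside the 16-bit range
}

# ASCII text is byte-for-byte identical in UTF-8.
same_as_ascii = "abc".encode("utf-8") == "abc".encode("ascii")

# The same letter in the fixed-width encodings (little-endian, no byte order mark).
utf16_bytes = len("ä".encode("utf-16-le"))
utf32_bytes = len("ä".encode("utf-32-le"))
```

The four samples take 1, 2, 3 and 4 bytes respectively, while ä takes 2 bytes in UTF-16 and 4 bytes in UTF-32.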
DNS
Domain Name System – The system that translates our domain names to Internet addresses. Each (IPv4) address consists of a sequence of 4 bytes, i.e. 4 integers between 0 and 255, which gives the familiar xxx.xxx.xxx.xxx form of an IP address. Each time you request a resource on the Internet by name, a DNS lookup is done which gives you the correct IP address.
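To show that an address of this kind really is just 4 bytes, here is a small sketch using Python's standard ipaddress module. The helper name ipv4_to_bytes is my own, and 192.0.2.1 is an address reserved for documentation, not a real host.

```python
import ipaddress


def ipv4_to_bytes(text):
    """Return the four byte values (0 to 255) behind a dotted IPv4 address."""
    return list(ipaddress.IPv4Address(text).packed)
```

ipv4_to_bytes("192.0.2.1") gives [192, 0, 2, 1], and a value such as "256.1.1.1" is rejected with an error because 256 does not fit in one byte.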
Punycode
Punycode was developed to make IDN domains possible. It is a way to represent a text string containing Unicode characters using only ASCII characters. The xn-- we are used to seeing in IDN domains is just a prefix, called the ACE prefix, and is not part of Punycode itself. It is presumably a way of telling clients: “This is a Punycode-encoded domain, treat it accordingly.”
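Python ships with a punycode codec, so the encoding can be tried directly. The sketch below uses my own helper names to_punycode and from_punycode, and the label domän as the example.

```python
def to_punycode(label):
    """Encode one Unicode label as a Punycode string (without the xn-- prefix)."""
    return label.encode("punycode").decode("ascii")


def from_punycode(text):
    """Decode a Punycode string (without the xn-- prefix) back to Unicode."""
    return text.encode("ascii").decode("punycode")


# The prefix is added separately; it is not produced by Punycode itself.
ace_label = "xn--" + to_punycode("domän")
```

to_punycode("domän") gives "domn-noa", and with the prefix added the label becomes "xn--domn-noa".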
ACE
ASCII Compatible Encoding – The ASCII form of an IDN label: the xn-- prefix (the ACE prefix) followed by the Punycode string.
IDN
Internationalized Domain Name – A system created so that we can have what we call åäö domains. That description is not strictly accurate, though: IDN supports characters from all of Unicode, which allows you to write domains with, for example, Cyrillic, Greek and Chinese characters.
Here you can test how your IDN domain looks to the underlying Internet system.
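If you would rather test this locally, the following is a minimal sketch using Python's built-in idna codec. Note the assumption: that codec implements the older IDNA 2003 rules (RFC 3490), so it may differ from RFC 5890/5891 for some characters. The helper names to_ascii and to_unicode are my own.

```python
def to_ascii(domain):
    """Convert a whole IDN domain to the ASCII form that DNS sees."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain):
    """Convert the ASCII form of a domain back to its Unicode form."""
    return domain.encode("ascii").decode("idna")
```

to_ascii("domän.se") gives "xn--domn-noa.se", and to_unicode("xn--domn-noa.se") gives "domän.se" back.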
IDNA
Internationalizing Domain Names in Applications – The standards and recommendations that should be followed when writing a program or system that handles IDN domains, such as browsers and email programs.
Protocol / Network Protocols
For different machines to understand each other, it is important that they speak the same language, or protocol as it is called.
The language at the bottom of the Internet is called TCP/IP, and that is where all connections are made. On top of it lie DNS (the name protocol), HTTP (web pages), FTP (a file transfer protocol) and all the other services we are used to on the Internet; each has its own language, or protocol.
HTTP
Hypertext Transfer Protocol – The protocol, or language, used to retrieve and send web pages. This is the language your browser uses to talk to the web host where your site lives.
Client
A program used to access a particular service that is provided by a server. Your browser and your e-mail program are two examples of this.
Server
A server is a computer that provides a particular service which a client can then connect to. For example, this page is located on a server, and your web browser (the client) downloads the page from it via a language called HTTP.
Case sensitive
Case sensitivity – Some systems make no distinction between uppercase and lowercase letters, while others do. For example, in Windows CristianHerrera is the same as cristianherrera. This does not apply to systems based on the POSIX standard, such as Linux and UNIX. This has caused difficulties in communication between systems.
We will touch on this in this article because some people like to write their domains in camel case, i.e. using a capital letter as the separator between words written together.
The story of IDN (short version)
In the beginning, the Internet could not handle 8-bit transfers over networks, and all characters had to be transmitted in encoded 7-bit segments. Since information is sent as integers, so-called ASCII tables were used to interpret the information as text.
Today you can send 8-bit segments, so it is easy to use UTF-8 characters, and this made room for internationalizing the domain name system so that you can use any characters: Arabic, Greek, extraterrestrial, hehe. Well, maybe not alien, but at least everything we have on our planet.
The system must be backward compatible
Since the Internet has expanded greatly and the DNS protocol is built around the ASCII standard, one of the requirements was that the whole system be backward compatible with the old one (just imagine a global project to update the entire Internet). Punycode was therefore developed to provide an ASCII representation of Unicode strings, or texts if you prefer to call them that.
Business as usual
In other words, the underlying system is virtually untouched. Our web servers, e-mail servers and DNS still understand only 7-bit ASCII, so the whole system is based simply on “translating” Unicode to ASCII and vice versa, and that is what Punycode does!
On pages 6–7 of RFC 5894, which is a summary of IDNA, you can read that the translation between the IDN form of the domain and its Punycode version shall take place before the request reaches the DNS system and after the answer comes back. This avoids having to update the entire infrastructure, such as the DNS system and the mail system. The old system can thus coexist with the new one without updating much of the Internet, which would be impossible to coordinate across the many servers at small and large businesses around the world.
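As a rough sketch of what "translate before the request reaches DNS" means for a client, the function below converts the name first and only then hands it to a resolver. The function name lookup and the replaceable resolver parameter are my own illustration, and the conversion uses Python's built-in idna codec (IDNA 2003 rules), which is an assumption rather than what any particular browser does.

```python
import socket


def lookup(domain, resolver=socket.gethostbyname):
    """Translate an IDN to ASCII in the client, then ask DNS for the address.

    The resolver only ever sees the ASCII form, so the DNS side needs no changes.
    """
    ascii_name = domain.encode("idna").decode("ascii")
    return resolver(ascii_name)
```

Passing your own function as resolver lets you see exactly which name would be sent to DNS without making a real network request; for "domän.se" that name is "xn--domn-noa.se".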
An amusing detail about case sensitivity
As I skimmed through the documents I found a funny little detail about case sensitivity. The traditional ASCII system allows both uppercase and lowercase letters, and even lets you mix them. This is not allowed in IDNs, because case is very language-dependent; in some languages certain letters may not even have a corresponding uppercase version. Therefore only lowercase letters are allowed in an IDN domain.
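A small sketch of what this means in practice: lowercase the name before converting it. The helper name prepare_domain is hypothetical, and calling lower() is a simplification of the mapping the IDNA documents describe, so treat it as an assumption rather than the full rule set. The conversion uses Python's built-in idna codec (IDNA 2003 rules).

```python
def prepare_domain(domain):
    """Lowercase a domain, then convert it to the ASCII form DNS sees."""
    return domain.lower().encode("idna").decode("ascii")


# Case is language-dependent: the German ß has no single-letter uppercase
# form in Python's Unicode case mapping, so uppercasing changes the length.
sharp_s_upper = "ß".upper()
```

prepare_domain("DomÄn.SE") gives "xn--domn-noa.se", the same result as for "domän.se", while "ß".upper() gives the two letters "SS".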
Not everything applies to the Swedish language
Not everything that had to be taken into account when IDN was created applies to the Swedish language; the work accounts for all the languages represented in the Unicode standard, including Turkish, Arabic and many more. So the interesting question for us is: why does it still not work?
What is it that does not work?
Since the underlying DNS system has not been touched, the translation is handled by the client, i.e. your browser or email program.
Web applications must also handle the translation between IDN and Punycode. So if, for example, your Twitter client or some other program you use does not support this, look for another client; try some different programs.
What do I do with services that do not support IDN such as Google Analytics?
If you must register for a service that does not recognize IDN domains, go for example to .IE's IDN converter, find out what the ASCII representation of your domain looks like, and use that instead.
In theory, you should also be able to send email to an IDN domain from a client that does not support it, if you just type the ASCII representation into the email client instead of the IDN, e.g. webmaster@xn--domn-noa.se instead of webmaster@domän.se. This should work because the mail protocol depends on the DNS system, which in turn only handles ASCII. Yes, in theory all systems work if you use the ASCII representation, since the DNS system is untouched.
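A minimal sketch of that conversion for an email address: only the domain part after the @ is translated, and the part before it is left alone. The helper name email_to_ascii is my own, and the conversion uses Python's built-in idna codec (IDNA 2003 rules), which is an assumption on my part.

```python
def email_to_ascii(address):
    """Rewrite the domain part of an email address in its ASCII form."""
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        raise ValueError("not an email address: " + repr(address))
    return local + "@" + domain.encode("idna").decode("ascii")
```

email_to_ascii("webmaster@domän.se") gives "webmaster@xn--domn-noa.se", which is the form to type into a client that does not understand IDN.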
What do I do when I link?
I would recommend that you use the IDN name in the anchor text but the ASCII representation in the link itself, e.g.:
[code] <a href="http://xn--domn-noa.se">domän.se</a>[/code]
If you have a good CMS, it will take care of this automatically.
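What such automation might look like is sketched below: the ASCII form goes into the href and the readable IDN into the anchor text. The function name idn_link is hypothetical and not taken from any particular CMS, and the conversion uses Python's built-in idna codec (IDNA 2003 rules).

```python
from html import escape


def idn_link(domain):
    """Build a link with the ASCII form in href and the IDN as anchor text."""
    ascii_host = domain.encode("idna").decode("ascii")
    return '<a href="http://{}">{}</a>'.format(
        escape(ascii_host, quote=True),  # safe inside the attribute
        escape(domain),                  # safe as visible text
    )
```

idn_link("domän.se") produces a link whose href is http://xn--domn-noa.se and whose visible text is domän.se.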
I would not worry about Google. Google is a web client like everyone else and has IDN support; Google will interpret this as “domän.se”.
Yes, that is all for this time. If you want to know more, I would recommend skimming through the RFC documents, especially RFC 5894, which is more of a summary of the whole thing. A word of caution, though: there is a lot of text, and it is not the easiest English.
Sorry I did not have time to cover more right now, but this should serve as an introduction to the subject, and I hope I managed to straighten out a question mark or two. If nothing else, you have got a rather nice online glossary at the beginning of the document.
IE6 still haunts us
November 2, 2010 | Posted in Internet standards | By Cristian Herrera
I read this article on IDG, just in time for All Saints' Day, about how a great many companies will wait to upgrade to Windows 7 and stay with XP for a while longer.
The first thing that strikes me is that this must be because many are locked into XP, having bought web-based systems that depend on Internet Explorer 6!
Anyway, without saying more, I refer to what it says at the end of that article. I quote:
“The analyst firm Gartner has also examined the conditions for a shift to a new operating system and notes that the biggest problem for many companies is the lock-in many have created to Internet Explorer 6. This means that about 20 percent of companies will face a higher cost than they expected when switching to Windows 7.”
And what can one say? The IE6 ghost lives on….
Is Microsoft abandoning Silverlight?
October 31, 2010 | Posted in Internet standards | By Cristian Herrera
Everyone who has worked with the web knows what a nightmare it has been to build sites that render the same in all browsers and on all platforms. I think it was worst in the early 2000s, when we had to recode every site in at least 4 different versions.
Unfortunately, many web developers began building applications and web pages that only worked in IE, and so today large organizations are sitting on insecure IT environments.
Since Microsoft became dominant in the market, they have been able to ignore standards compliance entirely, and this has been a big problem when developing designs.
Historically it has taken 3 times longer to develop something that renders correctly in IE, though Microsoft has gotten better at this in later versions of its browser.
I picked this up on Webmaster Network today, where one gets the impression that MS is going to stop working on Silverlight, so I had to look into it a bit more closely.
After a little digging I found an article on ZDnet.com in which Bob Muglia, the Microsoft executive in charge of server and tools, responsible for products such as Windows Server, SQL Server and Visual Studio, says that Silverlight will continue to be a cross-platform solution, but that HTML5 is the only true cross-platform solution for everything.
If I interpret everything correctly, Microsoft will work to make its browser standards-compliant and have it support HTML5, though it will continue to develop the Silverlight plugin.
I assume MS has realized that customers want a standards-compliant web and do not want to need plugins for every website they visit.
Ironically, when I went to watch a video about IE9 and standards compliance, it turned out that Silverlight is required to watch that video.
Those who want to test an HTML5 video player can do so today on YouTube.
And here you can test a little of what HTML5 will be able to do, if you have a standards-compliant browser.