Overview

System Localizer
The Design and Translation Process
Non-Technical Perspective

The online System Localizer interface brings the desktop version to a broader audience in an easier-to-use form. The System Localizer combines support for the developer with support for translation management, and was created specifically for software and IVR developers, game designers, and similar professionals. It offers many features, not all of which will be required for any one application. Examples:

Globally translating and controlling an Excel column of prompts or text with easy access to the “latest and greatest” version: the System Localizer imports the Excel file, and the translation, as well as review and approval cycles by other linguists, can be performed around the world in a secure online environment.

Translating text buried inside code: the code files can be imported, and the System Localizer will separate the text for translation from the code and display it in a manner accessible to translators who have the proper password. When translation is finished, the System Localizer will reassemble the code with the translated content. Certain pre-existing imports are available to parse translatable text from code, and it is also inexpensive to order custom imports, exports, and parsing.

Improving the internationalization of the developer’s hard code by using the “automated code generator” feature of the System Localizer to export ready-to-use localized source code that is prepared to play or display other languages properly. This auto-generated code can be in almost any programming language, follows a model set by the developer, and includes the developer’s customized variables and approach.

Integrating the System Localizer Runtime Engine into the application and ceasing to worry about complex linguistic rules, because the Runtime Engine makes the decisions on personalization, translation, and localization for the application on the fly at runtime.
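The second example above, separating translatable text from code and reassembling the code afterwards, can be sketched roughly as follows. This is only a minimal illustration and not the System Localizer’s actual import logic: it assumes translatable text appears as simple double-quoted string literals, and the function names extract_strings and reassemble and the {{n}} placeholder format are hypothetical.

```python
import re

# Assumption: translatable text appears as simple double-quoted literals
# with no escaped quotes inside. A real parser must handle far more cases.
STRING_LITERAL = re.compile(r'"([^"\\]*)"')
PLACEHOLDER = re.compile(r'"\{\{(\d+)\}\}"')


def extract_strings(source):
    """Return (skeleton, texts): the code with numbered placeholders, and the texts."""
    texts = []

    def replace(match):
        texts.append(match.group(1))
        return '"{{%d}}"' % (len(texts) - 1)

    skeleton = STRING_LITERAL.sub(replace, source)
    return skeleton, texts


def reassemble(skeleton, translations):
    """Put the translated texts back into their placeholder positions."""
    def restore(match):
        return '"%s"' % translations[int(match.group(1))]

    return PLACEHOLDER.sub(restore, skeleton)


code = 'print("Hello"); label = "Goodbye"'
skeleton, texts = extract_strings(code)
translated = reassemble(skeleton, ["Hola", "Adiós"])
```

Here the translator would see only the list in texts, while the skeleton keeps the code itself out of reach until the translated strings are merged back in.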

Whatever the approach, there is probably a way to handle it in the System Localizer. Each primary user has their own online database, not shared with others, plus levels of permissions and passwords for other participants. The databases were created specifically for private use, and because they are not shared with other customers, calls to the database can be made using the API in a manner similar to an in-house database.

Use of the online System Localizer can start small and grow over time. The online software can be used simply or in more complex ways, depending upon how the developer desires to use it. If the complex route is chosen, a wide variety of possibilities and design approaches is offered, so that each developer has complete control over their own project. The goal of the System Localizer software is to assure that whatever the developer creates can be translated and personalized with less effort and will function universally in up to 200 languages and dialects.

Background

Whenever software and other technologies are created and are planned for translation, localization, and personalization of the user experience, a series of steps should occur.

•    Translation is the exchange of words from one human language to another. So: an orange in English = una naranja in Spanish.

•    Localization is the adaptation of the product or the translation for a locale or specific target audience, and the re-arrangement of its parts to be grammatically correct for the new language. So, for Puerto Rico, orange = china, not naranja. Another example is the date 6/14 (or 14/6 in Europe): in the U.S., this would be read as “Sunday, June 14th”, whereas in the United Kingdom, this is more properly spoken as “Sunday, the 14th of June”.

•    Personalization is the adaptation or variance of an element to please an individual or company, such as inserting a company name, a preferred vocabulary word, or a company logo. Also included in this stage are changing telephony menus, such as “press 1 for Sales” for one customer, but “press 1 for Technical Support” for another.
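The date example in the localization item above can be sketched as a small pattern table. This is an illustrative sketch only: the locale codes en-US and en-GB, the pattern strings, and the function names are assumptions made for this example, and the year 2009 is chosen simply because June 14 fell on a Sunday that year.

```python
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Assumption: one spoken-date pattern per locale, kept as data rather than code.
PATTERNS = {
    "en-US": "{weekday}, {month} {day}",
    "en-GB": "{weekday}, the {day} of {month}",
}


def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 14 -> '14th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "%d%s" % (n, suffix)


def spoken_date(d, locale):
    """Render a date the way it would be read aloud in the given locale."""
    return PATTERNS[locale].format(
        weekday=WEEKDAYS[d.weekday()],
        month=MONTHS[d.month - 1],
        day=ordinal(d.day),
    )


us = spoken_date(date(2009, 6, 14), "en-US")
uk = spoken_date(date(2009, 6, 14), "en-GB")
```

The same date yields “Sunday, June 14th” for the U.S. pattern and “Sunday, the 14th of June” for the U.K. pattern; supporting another locale means adding a pattern, not another branch of code.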

When a software product is created, many assumptions are built into its design. Any time the software is altered to accommodate new languages or dialects, some of these assumptions cause the software to cease functioning properly, and it must be changed in order to accommodate linguistic differences and new preferences. For example, software may be coded to present, “Your bank balance is <amount>”. The <amount> indicates the place in the sentence where the amount of money will be inserted by the computer when the software is used. Typically, an assumption is made that for this presentation item, there will be text followed by a currency amount. However, other languages can formulate the same sentence differently:

•    “Your bank balance <amount> is.”
•    “<amount> is your bank balance.”
•    “Balance <amount> for your bank account is.”

Moreover, within this sentence, the variable <amount> requires more computation. In English, the result for $2,403 is typically read as “two thousand four hundred three”. In other languages, however, the same number might be read as “Two thousand four hundred zero-ten three”, or even “twenty four hundred and three”.
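The word-order problem described above can be sketched by storing each sentence as a complete template per language, so that the position of the variable is part of the translation rather than part of the code. This is a minimal sketch under stated assumptions: the keys master, order_a, and order_b are hypothetical stand-ins for real languages, and the templates are simply the English-glossed sentences from the list above.

```python
# Assumption: each language stores the whole sentence, with <amount>
# marking where the variable belongs in that language's word order.
TEMPLATES = {
    "master": "Your bank balance is <amount>.",
    "order_a": "Your bank balance <amount> is.",
    "order_b": "<amount> is your bank balance.",
}


def render(language, amount):
    """Insert the amount wherever the chosen language's template places it."""
    return TEMPLATES[language].replace("<amount>", amount)


sentences = [render(lang, "$2,403") for lang in ("master", "order_a", "order_b")]
```

The calling code is identical for every language; only the stored template differs. How the amount itself is read aloud is a separate problem that this sketch does not attempt.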

The System Localizer resolves vagaries that occur during technology translation, without the developer needing to know the rules or speak the languages. The System Localizer’s new online interface includes a Developer Center that assists companies in creating internationalized products.

An “internationalized product” has a slightly new way of handling the data that creates the user experience, specializing in variable data that changes every time a phrase is spoken, like “You have <5> messages. Message <1> received on date at time”. The program does not know how many messages there are, which message is being referred to, or when the message was received until it needs to present the information to the user. And when the information is known, there may be up to 25 ways to say the number 5 in some languages. How does the programmer write the algorithm that will decide which to use? Also, the word for “messages” may vary in other languages, because some languages have multiple plurals. English has singular (1) and plural (2 or more); many languages have several more states than these, such as 2-5, or any number ending in 13 except 13 itself, or countless other seemingly random rules regarding pluralization. In English, we think of “message” and “messages”; in other languages, the program must also accommodate “messagi”, “messagu”, “messagay”, or others.
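To make the multiple-plural problem concrete, here is a minimal sketch comparing English with Russian, whose integer plural rules (as commonly documented, for example in the Unicode CLDR plural rules) distinguish three forms. This is an illustration of the kind of rule involved, not the System Localizer’s own algorithm; the function and table names are assumptions.

```python
def plural_category_english(n):
    """English: one form for exactly 1, another for everything else."""
    return "one" if n == 1 else "other"


def plural_category_russian(n):
    """Russian integers: the form depends on the last one or two digits."""
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    return "many"


RULES = {"en": plural_category_english, "ru": plural_category_russian}

FORMS = {
    "en": {"one": "message", "other": "messages"},
    "ru": {"one": "сообщение", "few": "сообщения", "many": "сообщений"},
}


def message_word(language, n):
    """Pick the word for 'message' that agrees with the count n."""
    return FORMS[language][RULES[language](n)]


russian_words = [message_word("ru", n) for n in (1, 3, 5, 11, 21)]
```

Note that 21 takes the same form as 1 while 11 does not, which is exactly the kind of rule an application programmer is unlikely to guess.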

The System Localizer helps developers internationalize their product without requiring any knowledge of these types of linguistic rules, so that once the software has been internationalized, the product will play or display correctly in 200 languages and dialects without changing the software or its code. And once the product is internationalized, other managers and professionals can control the content and the translations with very low risk of error, with significant benefits to personalization.

System Localizer compared to Automated Translation

The System Localizer is a significant step above auto-translation in terms of quality. The System Localizer is a complete control mechanism over the translation and localization process. The developer can choose the manner to carry out the translation and localization – either by a human or by a machine.

The System Localizer gives the software developer the ability to choose which method to use for each translation, be it auto-translation, professional translators, or internal personnel. Each development manager knows his or her target and budget. The System Localizer assists the team in reaching the target in the manner most appropriate for the developer’s circumstances.

To be clear, auto-translation software comes in a wide range of quality, from incomprehensible to passable. Many auto-translators produce very questionable – even silly – results. The quality of auto-translation really depends upon the software package purchased and the content you wish to translate.

With the System Localizer, the developer or product manager is in control of the language quality of the software.

Developer Center

The System Localizer Developer Center is the centralized location for control of internationalization. The Developer Center:
•    Assists developers through the internationalization process
•    Assists translators throughout the translation cycle, including glossary creation
•    Assists reviewers to review translations by others
•    Assists QA and QC personnel with written and audio concatenation checks
•    Provides an approval process

Then, as a last step, a button is pressed and either:

•    Usable, localized hard code is issued in C#, C++, Java, XML, VXML, and other computer programming languages. This code is usable source code, and can be customized, upon request, to the developer’s internal naming convention.

•    A condensed version of all database elements is extracted into a binary file. This file, called an Extract File, will be transferred to a place where it can be found by the developer’s product, website or embedded application. This Extract File feeds the System Localizer’s Runtime Engine. The Runtime Engine is an alternative to code, and requires much less development effort when taking an application into multiple languages. The Runtime Engine is a DLL/shared library that can make all of the localization decisions for the software on the fly when it is running “live”. The Runtime Engine does not speak words, or show text, or play audio files. Rather, it tells the host application what to say, what to show, and what to play. Then, the application performs all tasks.

So, the Developer Center is a localization hub used by all: by the developer to internationalize the base system, by the translator to insert the translation online, and by the project manager to keep track of the project, to name a few roles. If audio files are involved, the audio files can be uploaded to the Developer Center for testing and concatenation QA from anywhere in the world. Even basic speech recognition can be translated and tested in many languages.

Using the Developer Center and Translation Center

Step #1: Determine whether data is Fixed or Variable
Variable data is the part of an entry or sentence that changes every time the data is presented, such as the number “5” in the sentence “You have 5 messages”. Fixed data is everything else. While certain words may change based on the variable (e.g., “message” vs. “messages”), these linguistic variances are still considered fixed data because they do not matter to the application programmer; they are not the programmer’s problem. The first step in the localization process is to internationalize the original language for the software, called the “master language.” Note that in the examples of complexities earlier in this document, most of the information that changed word order and content had variable information inserted. A full statement like “Thank you for calling ABC Company” is considered fixed data. Variable data includes dates, times, numbers, telephone numbers, email addresses, people’s names, company names, serial numbers, and any information that will change each time data is presented when the software is live.

First, the software application is studied, and text for translation divided into:

1) Fixed sentences/paragraphs with no variables. (“Thank you for calling ABC”)
2) Text with Variable Information (“You have <X> messages” or “Item shipped <Date> <Time>”)
3) Other media, such as images, movies, and audio, which can also be categorized if the developer wishes to vary them by user or by some other criterion.
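The first two categories above can be told apart mechanically once variables are marked in the text. The following minimal sketch assumes variables are written in angle brackets, as in the examples in this document; the classify function and the returned labels are hypothetical.

```python
import re

# Assumption: a variable is written as <Name>, as in "You have <X> messages".
PLACEHOLDER = re.compile(r"<([A-Za-z]\w*)>")


def classify(text):
    """Return ('fixed' or 'variable', list of variable names found)."""
    variables = PLACEHOLDER.findall(text)
    kind = "variable" if variables else "fixed"
    return kind, variables


greeting = classify("Thank you for calling ABC")
count = classify("You have <X> messages")
shipped = classify("Item shipped <Date> <Time>")
```

An inventory of this kind also helps when deciding whether the System Localizer should handle every presentation entry or only those containing variable information.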

Developers can choose whether they want the System Localizer to handle all presentation entries, or only the Text with Variable Information. Personalization can also be considered variable information; the System Localizer enables hundreds of personalizations of the user data, without changing the host software’s code.

Step #2: Assign a name to each event (entry)
In order to assure correct, natural translation into other languages, it is necessary to work in complete thoughts. Typically, software is designed to re-use phrases. For example, consider the phrases: “User added…”, “User deleted…”, “User changed…”. It is logical to assume that parts of these voice files can be re-used in other contexts, such as “Mailbox added…”, “Mailbox deleted…”, “Mailbox changed…”.

Unfortunately, these words that are common in English (“added”, “deleted”, and “changed”) are different in some languages depending on whether you’re talking about a user or a mailbox. Therefore each full thought, or sentence/paragraph, should be identified, and given an alpha-numeric name, like “UserAdded” or “MboxDeleted”. These sentences/paragraphs are called entries.

If the application is text-only, the entry names can be up to 30 characters. For audio products, like telephone or kiosk audio files, the names should be no more than 10 characters, because this name is the basis for a file naming convention that facilitates the handling of many thousands of voice files.
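The naming limits above are easy to check automatically before entries are loaded. This is a minimal sketch under stated assumptions: the product type labels text and audio and the function name are hypothetical, and “alpha-numeric” is taken to mean ASCII letters and digits only.

```python
# Limits taken from the text above; the "text"/"audio" labels are assumptions.
MAX_NAME_LENGTH = {"text": 30, "audio": 10}


def is_valid_entry_name(name, product_type):
    """Check that a name is ASCII alphanumeric and within the length limit."""
    return (
        name.isascii()
        and name.isalnum()
        and len(name) <= MAX_NAME_LENGTH[product_type]
    )


checks = {
    "UserAdded/audio": is_valid_entry_name("UserAdded", "audio"),
    "MboxDeleted/text": is_valid_entry_name("MboxDeleted", "text"),
    "MboxDeleted/audio": is_valid_entry_name("MboxDeleted", "audio"),
    "User Added/text": is_valid_entry_name("User Added", "text"),
}
```

Note that a name such as “MboxDeleted” has 11 characters: it fits a text-only application but would need shortening (for example to “MboxDel”) for an audio product.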

Step #3: Enter the entries online in the original master language
This data is entered via the web into a secure, private online database. The developer’s software does not need to access this database, but data can be exchanged via the API after the design process. The database is used for storage, for translation activity, and for the eventual creation of exportable, automatically generated hard code or Runtime Engine Extract Files.

The entries are entered into the database via the web in the Developer Center. Entries without variable data can also be mass-imported. For the entries with variable information, a new type of variable will be inserted in place of traditional variables. These new variables are internationalized variables, such as dates, times, numbers, and more. These internationalized variables are vastly different from what has been used in the past, and are central to the “design-once-translate-forever” philosophy.

The theory behind the change is that the traditional approach to localization is based upon “exceptions”. Something is translated, then, when it doesn’t work linguistically in another language or a client wants to personalize the content, more code is added or modified or data tables are changed by engineers until it works. The “localizing by exception” approach generally results in programming such as “if English then…if German then… if Mandarin then…” Because programmers are designing linguistic algorithms for languages they may not even know, there is an inherent risk of error, requiring extensive testing, time-consuming recoding, and data table tweaking. With the System Localizer, once your software works in the original master language, the code will work in all languages.

To make this “one code fits all” possible, the traditional approach to using variable information has been revamped. A byproduct of this revamping is that it is now much easier to use variable information. With the Developer Center, variable information is painless to handle, and the translations are natural and grammatically high quality in all languages.

Step #4: Translation Mode
Once the original master language has been entered, the translation phase begins. Translations and personalizations can be managed by marketers, managers, and customers. The Developer Center offers a way to assure that the software’s original creators have strong quality control, even in languages they do not speak.

Three approaches to translation are possible:

1) Professional translators
2) Internal staff or customer personnel
3) Third party auto-translation software

With regard to quality:

#1 Professional translator: This approach is the most likely to be outstanding. Quality is most assured if reviewed and approved. You can use your own translators, or the translators of @International Services, or use one of the recommended professionals from the Directory of Approved Translators.
#2 Internal or Customer staff: The resulting quality will depend upon your staff or your client’s skills.
#3 Auto-Translation: Always a roll of the dice.

How translation works:

Translators perform their translations online, in a secure environment. Each time “SAVE” is clicked, a concatenation check is performed that displays the translation with all variable information inserted, so that the translator can see the entry “in use” and assure that the result is correct.
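A written concatenation check of the kind described above can be sketched as follows. This is only an illustration, not the System Localizer’s implementation: the angle-bracket variable syntax, the sample values, and the function name are assumptions, and the comparison of variable names against the master entry is an extra check added for this example.

```python
import re

# Assumption: variables are written as <Name> in both master and translation.
PLACEHOLDER = re.compile(r"<([A-Za-z]\w*)>")


def concatenation_check(master, translation, samples):
    """Return (preview, problems) for a translation saved against a master entry."""
    master_vars = set(PLACEHOLDER.findall(master))
    translation_vars = set(PLACEHOLDER.findall(translation))

    problems = []
    for name in sorted(master_vars - translation_vars):
        problems.append("missing variable <%s>" % name)
    for name in sorted(translation_vars - master_vars):
        problems.append("unknown variable <%s>" % name)

    # Insert sample values so the translator sees the entry "in use".
    # Variables without a sample value are left visible as-is.
    preview = PLACEHOLDER.sub(
        lambda m: str(samples.get(m.group(1), m.group(0))), translation
    )
    return preview, problems


good = concatenation_check(
    "You have <Count> messages", "Tiene <Count> mensajes", {"Count": 5}
)
bad = concatenation_check(
    "You have <Count> messages", "Tiene mensajes", {"Count": 5}
)
```

The first call previews “Tiene 5 mensajes” with no problems; the second reports that the variable was dropped from the translation.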

Reviewers can follow in the translator’s footsteps to review the translations performed. There is also an approval process, if desired. If your company wishes to keep a glossary, the words and expressions can be pre-translated by clicking a button, to increase consistency.

Also from translation mode, other elements can be localized to be specific to the exact target market, such as pictures, movies, background audio.

Step #5: Personalization
Personalization uses the same “mode” as translation. Text and prompts can be personalized for the customer, with as many variations as desired per language family. Additionally, media can be personalized by swapping company logos, background pictures, and other elements.

Variable information can also be personalized, date and time formats can be changed, and word usage can be changed.

Menu order can be swapped based on common information and a customer preference variable. One customer can have “For English press 1, for Spanish press 2”, and this can be swapped online to be “For French press 1, for English press 2” for a different customer. The same applies to software interface menus and pulldown lists; items can be swapped or even deleted without changing the original software code.
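The menu swap described above can be sketched by keeping the menu order as per-customer data and building the prompt from it. This is a minimal sketch: the customer keys CustomerA and CustomerB, the MENU_ORDER table, and the build_menu function are hypothetical names used only for illustration.

```python
# Assumption: each customer's preferred menu order is stored as data,
# so changing it requires no change to the application code.
MENU_ORDER = {
    "CustomerA": ["English", "Spanish"],
    "CustomerB": ["French", "English"],
}


def build_menu(languages):
    """Build a 'press N' prompt from an ordered list of language names."""
    if not languages:
        return ""
    parts = [
        "for %s press %d" % (language, number)
        for number, language in enumerate(languages, start=1)
    ]
    prompt = ", ".join(parts)
    return prompt[0].upper() + prompt[1:]


prompt_a = build_menu(MENU_ORDER["CustomerA"])
prompt_b = build_menu(MENU_ORDER["CustomerB"])
```

Deleting an item from a customer’s list removes it from the prompt and renumbers the remaining choices, again without touching the code.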

In some traditional approaches to personalization, a copy of the original application is made, and then the prompts that need to be personalized are changed. With the System Localizer, the personalization is considered a new “language”. So, if your system’s original master language is EnglUS (English for the U.S.), then a personalization might be EnglUSHD (English for the U.S. for Home Depot).

If you are using the Runtime Engine of the System Localizer, languages are organized in a hierarchy and called using a language precedence order. So, for Home Depot, the language EnglUSHD would be used ahead of any other English variant. If the entry is not found, then it will default down the line to EnglUS. There can be up to 8 items in the language precedence order without losing notable speed. If you are using the automatically generated hard code approach, hierarchies can be customized, and the exported hard code reflects hierarchical searches already performed by the online System Localizer.
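The language precedence order can be pictured as a simple fallback search. The sketch below is illustrative only and does not show the Runtime Engine’s real interface: the lookup function, the nested dictionary layout, and the sample entry texts are all assumptions.

```python
def lookup(entries, entry_name, precedence):
    """Return (language, text) from the first language that defines the entry."""
    for language in precedence:
        text = entries.get(language, {}).get(entry_name)
        if text is not None:
            return language, text
    raise KeyError(entry_name)


# Hypothetical data: the personalization overrides only what differs.
entries = {
    "EnglUS": {
        "Greeting": "Thank you for calling.",
        "Goodbye": "Goodbye.",
    },
    "EnglUSHD": {
        "Greeting": "Thank you for calling The Home Depot.",
    },
}

precedence = ["EnglUSHD", "EnglUS"]
greeting = lookup(entries, "Greeting", precedence)
goodbye = lookup(entries, "Goodbye", precedence)
```

Because the personalized “language” stores only the entries that differ, everything else falls through to EnglUS automatically.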

Step #6: Review and Approval
There is a Review and an Approval mode. Other translators can proofread and edit translations, or customers can input their thoughts and ideas.

@International Services also offers a “watchdog service”. “Watchdogs” are professional translators who, upon request, review translations performed by your translators or by your customer’s translators or staff. Watchdogs can perform spot-checks or do a full review of your translations. The Watchdog mode is discreet and private, and designed not to offend those doing the actual translation. Reports are viewable only by those with permission.

Step #7: Reports and Scripts
Once all data has been entered, a wide variety of reports and scripts can be created, including talent scripts, database studies, database queries, and similar.

Step #8: Audio File Verification
If your software uses audio files, there is an online Audio File Tester. This Tester is used to review all audio file content to ensure that the audio content matches the script. There is also an Audio Concatenation feature that plays all of the entries (sentences) sequentially, with variables inserted, to assure that the complete phrases function as planned and sound the way they should. In translation mode, there is a written concatenation check, the idea being to eliminate concatenation errors before recording, thereby saving money on recording costs. The audio test will catch any lingering errors, and most errors can be corrected by simply editing an audio file.

Step #9: ASR Code Translator
For technologies that use speech recognition, a new ASR localization design area is expected to be available in 2011.

Step #10: Press a Button for Code or Extract
When the language translation or personalization has completed all of its cycles (developer original master input, translation, review, approval, testing), the chances of success are outstanding, far greater than any usual approach. If @International Services performs the translation and testing, the results are guaranteed.

To finish the localization cycle, the host software needs to receive its data, either as auto-generated hard code or as a Runtime Engine Extract File. Press a button and code can be exported as customized C#, C++, XML, Java, VXML, and more. Or, press a button and receive an Extract File, which is emailed to the developer to place in or near the software. The Extract File contains a highly condensed agglomeration of the database’s decision-making content, but without the database itself, and feeds the System Localizer Runtime Engine.

Step #11: Testing
The System Localizer software exports scripts for talent recording for audio products, text scripts for general use, and database content studies. It also offers online written concatenation testing and online audio concatenation testing. After the audio files have been recorded, they can be heard and their content studied online as well.