Hi everyone, I'm Anish and this is Mike, and we're going to talk about predictive text input. First of all, thank you all for attending our talk. I'm new here, this is my first time at GUADEC. I was talking to several people about input methods and hardly anybody found it interesting, so I guess you are the people who are interested in it, and that is really good for us.

First of all I would like to thank the GNOME team, because they are the first ones who care about the users who are interested in using GNOME in all languages. Last year IBus was integrated into GNOME. There was some discussion around it on the GNOME mailing lists, but honestly, for us it is a good thing to have IBus in the desktop, and I would really like to thank the people who did that work.

Then let's start. I'm going to talk about what input methods are, then why predictive input methods are needed, a bit of the theoretical part behind them, and then the projects we are currently working on, so that you get to know more about the predictive text stuff. If you have any questions at any time, feel free to interrupt us. I'll be happy to take questions at the end as well.

So let's start with what input methods are. I made this slide because most of you may not know what input methods are, since most of you are using a Spanish keyboard, an English keyboard or a similar keyboard, so I thought it would be a good idea to have a slide like this. There are roughly two types of input methods: character-based input methods and sentence-based input methods. Character-based input methods are basically for Indic languages, Korean, or
Vietnamese. We also call them transliteration-based input methods. Why transliteration-based? Because there is a conversion between ASCII or Latin characters and the characters of the other script. That is why we call them character-based input methods.

For Chinese and Japanese we call them sentence-based input methods, because in those languages you don't have a space between the words, so it is really complex to build such input methods. A Japanese sentence looks like this one on the slide. That is the whole sentence, and it is nothing but our names in Japanese. Honestly, I don't know much about Japanese, but Mike does, so he has input those characters. You can see that there are almost no spaces between the characters. Naturally there are no spaces between the words in Chinese or Japanese text. So it becomes really hard to type Japanese and Chinese on a computer, because we only have, I guess, thirty-two or so alphabet keys, generally speaking, and typing characters other than English or Latin characters with them is a really difficult job.

Right now, if I use the computer in my mother tongue, which is Marathi, I end up with this frustrated face. That is the current state of input methods: after typing something you look like this. Why? For example, if I want to write about GNOME in my own language on the desktop, ideally it should take twenty-five keystrokes, but it takes around ninety keystrokes. That makes me mad: why do I need to type ninety keystrokes for something which I could type easily on an English or Latin keyboard? So predictive text is one of the ways we are trying to solve
that problem, so that you have to type less and you get some suggestions, and maybe this slide will make that face happy.

Now the need for such predictive input methods. I added this slide just today, because I was listening to the keynote, where the speaker showed some statistics about Brazil, so I thought, why not do the same for India? We have a population of about 1.21 billion, out of which seventy-four percent are literate, that is, they can read and write in their own language, and only five to six percent of the whole population can understand English. I added this explicitly because I have been travelling in Europe for the last seven days and I met several people who have the misconception about India that everyone in India can understand English. That is really false: out of this population, only five to six percent understand English. So potentially most of those 1.2 billion people are users you want. They use your technology, your operating system or your mobile devices, and if you don't give them good prediction, they are not going to use your software.

For example, last year the mobile companies officially sold more than two million Android devices in India. Why is Android so popular in India? Because on Android you get a lot of free apps and you also get good input methods. In this room as well, we all use such input methods on our mobile or Android devices. And if you look at the statistics, we have twenty-two officially recognised languages, in a number of groups. In the rest of the world there are many more languages and users, and you should provide good input methods to them, so that they can
type in them, and so that it helps to preserve those languages. Another point is that we now also have touch input, GNOME on tablets and that kind of thing, and for that we need predictive input methods as well. And another thing: you may know one language and be really good at typing it, but most of us speak more than one language. We know one language really well, but we don't know the other language that well, and typing in such a language is really hard. For example, if you go to China, people there are really good at typing Chinese, but if you tell them to type in English it becomes hard, because they know the language but they are not really fluent in it. So that is the need for such input methods.

Let's talk about how we can implement such things. Getting these predictions is really hard, because of the number of words in a language, and how can you predict the next one? You really don't know what I am going to say next. So we use statistical techniques to predict the next word, and this is called a language model. A language model is nothing but this: given a language, what is the probability that one word follows another word? For example, right now I'm saying something about predictive text, so you can guess that my next word or my next sentences will be something about the language model. Similarly, predictive text entry, autocompletion or any such input method does the same thing.

Then we have the simple language model. There the probability of a word is the number of occurrences of that word divided by the total number of words in the language. Some words also tend to occur together, for example "I am going": whenever I say "I", then the
probability that the next word is "am" is higher. I'm not saying it will be exactly that word, it is just probable. If you know a little mathematics, fine, but I don't want to go too far in that direction, because it is kind of boring and you would not like it much.

Markov proposed a really good theory, which says that if you know the history, and the conditions remain the same, then you can calculate the future. The same idea is used in machine learning techniques, so you can predict the next word. But the probability of being right is something like eighty percent, and you can't say a hundred percent. Why? Because we are humans, and the human mind is like that: we really don't know what we will do next. That makes text prediction really hard.

So the probability of a word depends on the probability of the previous words. That is the basic thing that is used in text prediction. We calculate unigrams, bigrams and trigrams. A unigram is nothing but a single word, a bigram is nothing but a set of two words, and a trigram is nothing but a set of three words. For example, "GNOME" is a unigram, "GNOME is" is a bigram, and "GNOME is awesome" is a trigram. You calculate such probabilities on a huge corpus. Say we have n words in a given sentence: we calculate the unigrams, bigrams or trigrams, and depending on those we try to predict the next word.

For example, say you have the sentences "GUADEC is awesome", "GNOME is awesome" and "GNOME Shell is awesome". There are two additional words, start and stop, which you can consider special symbols, so that you can tell that a sentence has started and that a sentence has finished. So in this example the vocabulary in
your document or corpus is: start, GUADEC, is, awesome, stop, GNOME and Shell. If you want to calculate the unigram model, take for example the probability of the word "GUADEC". It is one by sixteen. How come one by sixteen? Because "GUADEC" is used once in the whole corpus and the number of words in the corpus is sixteen, so the probability is one by sixteen. Similarly, the probability of "is" is three by sixteen. If you apply the same logic to every word here, you get the unigram model.

Similarly you can apply the same logic to the bigram model. As I said, a bigram is a set of two words. So you take the count of "start GNOME" divided by the count of start: "GNOME" is used twice at the start of a sentence in this whole corpus, and the number of sentences beginning with the start symbol is three, so the probability is two by three. If you apply the same logic to a whole sentence, for example "start GUADEC is awesome stop", you get the probability of "GUADEC" given start, times the probability of "is" given "GUADEC", and so on to the end, all multiplied together.

So that is all about the theoretical part. There is more to it, for example what to do if you get an unknown sentence and how to normalise such sentences, but I really don't want to get into that complexity.

So let's talk about the projects we are working on. One of the projects is ibus-typing-booster. At this point Mike will take over and try to demonstrate it.

Okay. So we implemented something like that as an IBus engine. It supports most languages that can easily be transliterated.
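
The unigram and bigram arithmetic from the theoretical part can be checked with a few lines of Python. This is only a minimal sketch, assuming the three example sentences are "GUADEC is awesome", "GNOME is awesome" and "GNOME Shell is awesome", and using the hypothetical tokens `<s>` and `</s>` for the start and stop symbols. The function names are made up for this illustration and it is not code from the projects discussed in the talk.

```python
from collections import Counter
from fractions import Fraction

START, STOP = "<s>", "</s>"  # hypothetical spellings of the start/stop symbols


def tokenize(sentence):
    """Split on whitespace and wrap the sentence in start/stop symbols."""
    return [START] + sentence.split() + [STOP]


def train(sentences):
    """Count unigrams and bigrams over all sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def unigram_prob(word, unigrams):
    """Occurrences of the word divided by the total number of tokens."""
    return Fraction(unigrams[word], sum(unigrams.values()))


def bigram_prob(prev, word, unigrams, bigrams):
    """count(prev word) / count(prev); zero if prev was never seen."""
    if unigrams[prev] == 0:
        return Fraction(0)
    return Fraction(bigrams[(prev, word)], unigrams[prev])


def sentence_prob(sentence, unigrams, bigrams):
    """Multiply the bigram probabilities along the whole sentence."""
    tokens = tokenize(sentence)
    prob = Fraction(1)
    for prev, word in zip(tokens, tokens[1:]):
        prob *= bigram_prob(prev, word, unigrams, bigrams)
    return prob


corpus = ["GUADEC is awesome", "GNOME is awesome", "GNOME Shell is awesome"]
uni, bi = train(corpus)
print(unigram_prob("GUADEC", uni))                  # 1/16
print(unigram_prob("is", uni))                      # 3/16
print(bigram_prob(START, "GNOME", uni, bi))         # 2/3
print(sentence_prob("GUADEC is awesome", uni, bi))  # 1/3
```

The sketch does no smoothing, so any sentence containing an unseen word pair gets probability zero. That is the "unknown sentence" problem the talk mentions and deliberately skips.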
As Anish already said, it does not support Chinese and Japanese, because there an extra, more complicated step for the conversion to Chinese characters is necessary. But practically all other languages are supported: those where the work is already finished after transliteration, and all those where direct keyboard input is already enough. It uses the well-known input methods from the m17n library, so users who know these don't need to get used to new stuff.

The hope is to improve typing speed a lot by getting very good predictions, so that you type only a few letters and then select the whole word. Most of the prediction comes from what the user types: it learns from the user input. One can speed that up by giving it some typical text of the kind the user usually types, to reduce the time needed for learning. To explain this: the prediction is based on the previous two words, using an SQLite database, and if no suitable word, or not enough suitable words, can be found in the database, it falls back to hunspell dictionaries and shows predictions from the hunspell dictionaries. It also uses hunspell for correcting minor spelling errors. Currently it is implemented in Python and the database is SQLite.

Let me show a little bit how it works. I'm selecting the German ibus-typing-booster. First of all I delete everything which has been learned so far, to demonstrate the learning. So if I type some German text, you see that the second time I typed it, I could just select it after typing one letter. And you see that it even predicts the next word based on the previous context: actually I didn't type the last word at all. This one is wrong because I made a typing mistake, so the first suggestion is no good. If I want to delete it from the database, I can select it, not with the number key alone but with Control plus the number key, and now this suggestion is gone from the database.
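
The scheme just described (learn word sequences from the user, predict from the previous two words, fall back to a dictionary, allow removing a suggestion) can be sketched with Python's standard `sqlite3` module. This is a hedged illustration only: the table layout, the class and method names, and the 4/2/1 context weights are assumptions made up for this sketch, and a plain word list stands in for the hunspell dictionaries. It is not the actual ibus-typing-booster code or schema.

```python
import sqlite3


class Predictor:
    """Toy predictor: learned word sequences first, dictionary second."""

    def __init__(self, dictionary_words=()):
        self.db = sqlite3.connect(":memory:")
        # Hypothetical schema: the two preceding words, the word, a count.
        self.db.execute(
            "CREATE TABLE phrases (p2 TEXT, p1 TEXT, word TEXT, freq INTEGER)")
        # Stand-in for a hunspell dictionary: a sorted plain word list.
        self.dictionary = sorted(dictionary_words)

    def learn(self, text):
        """Record every word of the text together with its two predecessors."""
        p2 = p1 = ""
        for word in text.split():
            cur = self.db.execute(
                "UPDATE phrases SET freq = freq + 1 "
                "WHERE p2 = ? AND p1 = ? AND word = ?", (p2, p1, word))
            if cur.rowcount == 0:
                self.db.execute(
                    "INSERT INTO phrases VALUES (?, ?, ?, 1)", (p2, p1, word))
            p2, p1 = p1, word

    def forget(self, word):
        """Remove a learned suggestion, for example a typing mistake."""
        self.db.execute("DELETE FROM phrases WHERE word = ?", (word,))

    def predict(self, p2, p1, prefix, limit=5):
        """Rank learned words by context match, then pad from the dictionary."""
        rows = self.db.execute(
            "SELECT word, SUM(freq * (CASE WHEN p2 = ? AND p1 = ? THEN 4 "
            "WHEN p1 = ? THEN 2 ELSE 1 END)) AS score "  # arbitrary weights
            "FROM phrases WHERE substr(word, 1, ?) = ? "
            "GROUP BY word ORDER BY score DESC, word ASC LIMIT ?",
            (p2, p1, p1, len(prefix), prefix, limit)).fetchall()
        words = [row[0] for row in rows]
        for candidate in self.dictionary:
            if len(words) >= limit:
                break
            if candidate.startswith(prefix) and candidate not in words:
                words.append(candidate)
        return words


booster = Predictor(dictionary_words=["good", "goodbye"])
booster.learn("guten Morgen liebe Kollegen")
booster.learn("guten Morgen liebe Kollegen")
booster.learn("guten Morgen liebe Leute")
print(booster.predict("Morgen", "liebe", "", limit=2))  # ['Kollegen', 'Leute']
print(booster.predict("", "", "goo"))                   # ['good', 'goodbye']
```

With an empty prefix the sketch predicts the next word from context alone, which corresponds to the case in the demonstration where the last word was not typed at all.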
This learning process can also be fed from a text file, which speeds it up: I can select "learn from text file". As an example, I have some book here, which the system has now read. If I now look at some text in this book, I can easily input the same text again with very little typing. You also see that I'm using the German typing booster, but actually what I typed here is English. So for the algorithm it doesn't really matter what language you are using, and you can mix languages freely, just like this.

Currently we still have different engines for every language, but I want to merge them so that fewer engines support the same set of languages, instead of having such a large number of engines. You can also use the same system for practically any other language. Let me type something here in Korean. I don't know what this means, but you see that the first character of each suggestion is in Hangul. I've typed only one jamo, and the first character of each suggested word has that jamo as its first jamo. So that is Korean. Okay, I think that's it for the demonstration.

Cool. The current problem with ibus-typing-booster is that you can't use the same code for other engines, or if you want to use the same code it is really tedious. So we have started one more project: a text prediction library, written in Vala, so that you can use it in other projects as well. The library is nothing but this: we handle all the key events, and the client just has to subscribe for predictions, so that once you have subscribed you get the predictions as they come.

Next: we honestly need your help. We need help with testing, and suggestions for improvements or new features, because you are the users, and if you have some suggestions, we
are happy to implement them. Then again, the hunspell dictionaries are what we are using now, and I honestly don't think anybody maintains many of them. For a lot of languages, somebody created the hunspell dictionary maybe five to six years ago and that was it. We would like to improve those. The other thing is the creation of corpora. That is something we really need, and for our languages it is really hard to get a free corpus for these predictions.

In future we might want to add some grammatical analysis as well, and for that a corpus might be interesting. At the moment we are doing only this Markov model stuff, and there a big corpus doesn't actually help that much. If you take a huge corpus, like all of Wikipedia for English, the prediction of the next word based on the simple Markov model is right something like one time out of two hundred and fifty or one out of five hundred, which isn't very good. So at the moment it works well only if the text it has learned from is what the user actually writes. Normal users don't write novels and don't try to write in a complicated style like Oscar Wilde. People tend to write "let's meet for lunch" or something like this, and what a typical user types is much more repetitive. So with the Markov model, really learning from the user input is much more helpful than reading a big corpus. And maybe that's it. Thank you.

Thank you. Is the predictive input you have implemented and demonstrated also available to KDE users, and did you get any feedback from KDE users?

So far we actually got very little feedback. I asked for testers, and I asked some of my colleagues to test it and got some nice suggestions for improvements, which I implemented, but there wasn't that much feedback, and I can't remember anybody from KDE. How well it works on KDE, I don't know.

So obviously I
think this is useful in its context, but you mostly presented it in terms of physical keyboards. So I'm wondering whether you have given a thought to how we can take this and apply it when somebody is using an on-screen keyboard. We have a more general issue of how we integrate input methods with on-screen keyboards, but I was curious what thoughts you had.

Currently it doesn't work with on-screen keyboards, but we want to make it work in future. That is something we are thinking about, and it is also one of the reasons why Anish wants to put it into a library, because then it will be easier to use from an on-screen keyboard than with the current implementation. And for on-screen keyboards I think it makes much more sense. Actually, for myself, when I type German or English I'm typing too fast, so usually it is easier for me to just finish typing the word instead of looking and selecting. But Anish said that many people in India are not comfortable with the way transliteration works and have a hard time figuring it out. So for people in India who use a computer for the first time it is very helpful if they get some suggestions after typing only a few letters. It is similar to how people on touch screens have difficulties typing.

I guess that makes me wonder about another question: have you thought about whether this should be enabled by default in some languages? If you choose an Indian input language, should it just work like this by default?

Yes, we are planning something like that, but before that we need to fix a couple of bugs. For example, when you try to integrate it as a default input method, you need to fix a few things. Say you're typing something in Google: you wouldn't want two sets of suggestions. Google displays some suggestions itself, and our lookup table gets in the way of the Google suggestions, so they overlap each other. So it is
a nuisance to switch it off and on all the time. Another example: if you want to type something in a terminal, in that case as well you wouldn't need suggestions. So we need to do something about that.

I mean, in GTK there is actually a way to control that now. There are these input hints that you can apply to text entry fields. You can say, you know, this is a calculator and I only want numeric stuff in this field, or you can say that the field inhibits the on-screen keyboard, which could then maybe imply "I don't want word prediction here". So maybe we can extend that technique and apply it to other toolkits and things.

I have no good idea for something like the Google search field at the moment, because of course sometimes, when you type in the browser, you do want it, if you use it for mail or chat or whatever. And how do you find out that the user is typing into the Google search field? I don't know how to do that at the moment.

I think Android may do the right thing. I'm not completely sure, but I think there might be hints in HTML, so we would just have to expose them through to get them to the right place.

You mean hints in the web page?

Maybe. We should look into it and see.

Okay, any other questions? Thank you very much.