Re: [Dbpedia-gsoc] QA engine (was Fwd: Request to access experimental code)
Hello everyone,

I agree with Marco about the value a multilingual feature would add, even if it is solved for only a subset of languages. However, the sheer size of the problem matters once time is taken into account. Wikipedia had 285 language versions as of August 2012 [1], and DBpedia has 119 language versions (chapters) [2]. Also, acknowledging the point made by Pablo, a flexible overall architecture would require POS taggers, dependency parsers and other NLP tools for each language under consideration, and building these can be a research problem in itself. Where such tools are missing, linguistic analysis could help, but it requires an expert in that language who can help identify the important entities in a given query (i.e. training using heuristics). That is a problem worth exploring, but it may be difficult in the given time frame (correct me if I am wrong).

However, establishing a QA system for English, or leveraging existing knowledge, would be easy. It could then be wrapped with a language translation API so that users can ask a question in their own language and get answers accordingly, which in my view would be more feasible. (A humble request to all readers to provide insights, if any.)

- Ankur

References:
[1] http://en.wikipedia.org/wiki/List_of_Wikipedias
[2] http://dbpedia.org/About

On Mon, Mar 3, 2014 at 2:45 PM, Marco Fossati hell.j@gmail.com wrote:

Hi everyone,

On 3/3/14, 7:55 AM, Sourish Dasgupta wrote:

Hello all, I agree with Marco that we should be concentrating on English for the moment. This is because every language has some innate characteristic linguistic nuances which may not be found in other languages. However, I still feel that techniques borrowed from computational semantics might be helpful in improving the accuracy significantly. By and large languages follow the SVO structure (Subject Verb Object). It covers approx. 42% of world languages [1].
So the research insights that we get in working with English, both at a statistical and at a linguistic level, might be very important for future extension.

That's exactly my point, more technically explained. If we manage to implement simple questions that fit well in all SVO languages, this would be a dramatic added value in terms of multilingual support.

Cheers!

Sourish

[1] Russell Tomlin, Basic Word Order: Functional Principles, Croom Helm, London, 1986.

On Mon, Mar 3, 2014 at 12:18 AM, Pablo N. Mendes pablomen...@gmail.com wrote:

I just want to suggest caution with multilinguality. It doesn't seem to me like an easy problem. QA is already hard enough in one language; if one tries to solve it for all languages at once, it will be overwhelming in three months. I'd suggest focusing on English, but not hardcoding anything that is language specific: keep such things in configuration files and properly engineered subclasses, on the assumption that one day it will all be ported to another language.

On Mar 1, 2014 1:10 AM, Ankur Padia padiaan...@gmail.com wrote:

Hello Marco,

This is to inform you that I am resending a copy of my earlier mail, as I forgot to CC the author of the referred paper, who is working on the development of a QA system using DL as a tool for language representation.

I think the Google Translator API would come in handy for converting a foreign-language question into an English one in cases where a particular piece of knowledge or triple is missing from the given chapter [2], and then firing it against the English knowledge base, which does have it. However, there is a paper on Wh-questions by Dr. Sourish Dasgupta (cc) and their possible semantic formalization in description logics [1].

- Ankur.

References:
[1] http://arxiv.org/abs/1312.6948
[2] Approach taken in QAKiS.
On Fri, Feb 28, 2014 at 4:54 PM, Marco Fossati hell.j@gmail.com wrote:

Also, I think a crucial point will be the multilingual capabilities of the tool. In this way, all the DBpedia chapters can benefit from it. So, the first implementation should focus on very simple questions, but in multiple languages. WH questions would be great. Of course, this requires language-specific validation. We will definitely need the help of the worldwide community. Sounds like the project is getting more and more exciting!

Cheers,

On 2/28/14, 12:08 PM, Marco Fossati wrote:

Hi Ankur,

On 2/28/14, 2:00 AM, Ankur Padia wrote:

Among the
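
Pablo's suggestion quoted in this message (focus on English, but keep everything language specific in configuration files and properly engineered subclasses) could look roughly like the sketch below. It is only an illustration, not code from the project: the names LanguageConfig, QuestionAnalyzer and EnglishQuestionAnalyzer, as well as the word lists, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LanguageConfig:
    """Language-specific resources, kept out of the core logic (hypothetical)."""
    code: str
    wh_words: frozenset
    stop_words: frozenset = field(default_factory=frozenset)


# Illustrative values only; in practice these could be loaded from a
# per-language configuration file.
ENGLISH = LanguageConfig(
    code="en",
    wh_words=frozenset({"who", "what", "where", "when", "which"}),
    stop_words=frozenset({"is", "the", "of", "a", "an"}),
)


class QuestionAnalyzer:
    """Language-independent base class; subclasses override hooks as needed."""

    def __init__(self, config):
        self.config = config

    def tokenize(self, question):
        # Naive whitespace tokenizer; a subclass could plug in a real
        # tokenizer for languages where this is not adequate.
        return question.rstrip("?").lower().split()

    def analyze(self, question):
        tokens = self.tokenize(question)
        wh = next((t for t in tokens if t in self.config.wh_words), None)
        content = [
            t for t in tokens
            if t not in self.config.wh_words
            and t not in self.config.stop_words
        ]
        return {"lang": self.config.code, "wh": wh, "content": content}


class EnglishQuestionAnalyzer(QuestionAnalyzer):
    """English binding: only supplies the English configuration."""

    def __init__(self):
        super().__init__(ENGLISH)
```

With this layout, porting to another language would mean adding a new LanguageConfig and, where the naive defaults are not good enough, a small subclass that overrides tokenize.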
Re: [Dbpedia-gsoc] QA engine (was Fwd: Request to access experimental code)
Hi everyone,

On 3/3/14, 7:55 AM, Sourish Dasgupta wrote:

Hello all, I agree with Marco that we should be concentrating on English for the moment. This is because every language has some innate characteristic linguistic nuances which may not be found in other languages. However, I still feel that techniques borrowed from computational semantics might be helpful in improving the accuracy significantly. By and large languages follow the SVO structure (Subject Verb Object). It covers approx. 42% of world languages [1]. So the research insights that we get in working with English, both at a statistical and at a linguistic level, might be very important for future extension.

That's exactly my point, more technically explained. If we manage to implement simple questions that fit well in all SVO languages, this would be a dramatic added value in terms of multilingual support.

Cheers!

Sourish

[1] Russell Tomlin, Basic Word Order: Functional Principles, Croom Helm, London, 1986.

On Mon, Mar 3, 2014 at 12:18 AM, Pablo N. Mendes pablomen...@gmail.com wrote:

I just want to suggest caution with multilinguality. It doesn't seem to me like an easy problem. QA is already hard enough in one language; if one tries to solve it for all languages at once, it will be overwhelming in three months. I'd suggest focusing on English, but not hardcoding anything that is language specific: keep such things in configuration files and properly engineered subclasses, on the assumption that one day it will all be ported to another language.

On Mar 1, 2014 1:10 AM, Ankur Padia padiaan...@gmail.com wrote:

Hello Marco,

This is to inform you that I am resending a copy of my earlier mail, as I forgot to CC the author of the referred paper, who is working on the development of a QA system using DL as a tool for language representation.
I think the Google Translator API would come in handy for converting a foreign-language question into an English one in cases where a particular piece of knowledge or triple is missing from the given chapter [2], and then firing it against the English knowledge base, which does have it. However, there is a paper on Wh-questions by Dr. Sourish Dasgupta (cc) and their possible semantic formalization in description logics [1].

- Ankur.

References:
[1] http://arxiv.org/abs/1312.6948
[2] Approach taken in QAKiS.

On Fri, Feb 28, 2014 at 4:54 PM, Marco Fossati hell.j@gmail.com wrote:

Also, I think a crucial point will be the multilingual capabilities of the tool. In this way, all the DBpedia chapters can benefit from it. So, the first implementation should focus on very simple questions, but in multiple languages. WH questions would be great. Of course, this requires language-specific validation. We will definitely need the help of the worldwide community. Sounds like the project is getting more and more exciting!

Cheers,

On 2/28/14, 12:08 PM, Marco Fossati wrote:

Hi Ankur,

On 2/28/14, 2:00 AM, Ankur Padia wrote:

Among the approaches listed before, I would prefer TBSL, as its scope is relatively wider.

All right, go ahead with that.

Ideally, a QA engine for DBpedia should be able to handle all kinds of questions, with appropriate semantic parsing and satisfactory conversion to SPARQL queries. The scope of a QA system depends heavily on the time at hand. For example, given the span of GSoC, addressing even a small number of English nuances in queries would be ambitious (correct me if I am wrong).

Exactly, keep in mind that a successful project implies a tool that actually works. Hence, I suggest proceeding first with the implementation of single-predicate queries, in order to provide reasonable coverage of simple questions.
Cheers,
--
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j
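
To make the "single-predicate queries" suggestion above concrete, here is a minimal sketch of how one very simple English question shape could be turned into a SPARQL query with a single triple pattern. It is not TBSL and not the project's code: the regular expression, the tiny PROPERTY_LEXICON and the function names are assumptions made for illustration, and a real system would need proper entity linking and a much larger lexicon.

```python
import re

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"
DBPEDIA_ONTOLOGY = "http://dbpedia.org/ontology/"

# Hypothetical hand-written lexicon mapping a question word to a DBpedia
# ontology property; the entries are illustrative only.
PROPERTY_LEXICON = {
    "capital": "capital",
    "population": "populationTotal",
}

# Covers only questions of the form "What is the <property> of <entity>?"
PATTERN = re.compile(r"what is the (\w+) of (.+?)\??", re.IGNORECASE)


def single_predicate_query(entity, prop):
    """Build a SPARQL query with exactly one triple pattern."""
    subject = DBPEDIA_RESOURCE + entity.strip().replace(" ", "_")
    predicate = DBPEDIA_ONTOLOGY + prop
    return f"SELECT ?answer WHERE {{ <{subject}> <{predicate}> ?answer }}"


def question_to_query(question):
    """Return a SPARQL string, or None if the question is not covered."""
    match = PATTERN.fullmatch(question.strip())
    if match is None:
        return None
    prop = PROPERTY_LEXICON.get(match.group(1).lower())
    if prop is None:
        return None
    # The entity name is used as written; no entity linking is attempted.
    return single_predicate_query(match.group(2), prop)
```

Returning None for anything outside the pattern keeps the behaviour honest: the sketch either produces a query it understands or declines, which matches the goal of a small tool that actually works.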
Re: [Dbpedia-gsoc] QA engine (was Fwd: Request to access experimental code)
Hello Marco,

This is to inform you that I am resending a copy of my earlier mail, as I forgot to CC the author of the referred paper, who is working on the development of a QA system using DL as a tool for language representation.

I think the Google Translator API would come in handy for converting a foreign-language question into an English one in cases where a particular piece of knowledge or triple is missing from the given chapter [2], and then firing it against the English knowledge base, which does have it. However, there is a paper on Wh-questions by Dr. Sourish Dasgupta (cc) and their possible semantic formalization in description logics [1].

- Ankur.

References:
[1] http://arxiv.org/abs/1312.6948
[2] Approach taken in QAKiS.

On Fri, Feb 28, 2014 at 4:54 PM, Marco Fossati hell.j@gmail.com wrote:

Also, I think a crucial point will be the multilingual capabilities of the tool. In this way, all the DBpedia chapters can benefit from it. So, the first implementation should focus on very simple questions, but in multiple languages. WH questions would be great. Of course, this requires language-specific validation. We will definitely need the help of the worldwide community. Sounds like the project is getting more and more exciting!

Cheers,

On 2/28/14, 12:08 PM, Marco Fossati wrote:

Hi Ankur,

On 2/28/14, 2:00 AM, Ankur Padia wrote:

Among the approaches listed before, I would prefer TBSL, as its scope is relatively wider.

All right, go ahead with that.

Ideally, a QA engine for DBpedia should be able to handle all kinds of questions, with appropriate semantic parsing and satisfactory conversion to SPARQL queries. The scope of a QA system depends heavily on the time at hand. For example, given the span of GSoC, addressing even a small number of English nuances in queries would be ambitious (correct me if I am wrong).

Exactly, keep in mind that a successful project implies a tool that actually works.
Hence, I suggest proceeding first with the implementation of single-predicate queries, in order to provide reasonable coverage of simple questions.

Cheers,
--
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j

___
Dbpedia-gsoc mailing list
Dbpedia-gsoc@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
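
The translation fallback Ankur describes in this message (ask the local chapter first, and only translate to English when the chapter has no answer) can be sketched as follows. This is an assumption-laden illustration: answer_with_fallback, the toy knowledge base and the toy translation table are all hypothetical, and the two callables stand in for a real chapter endpoint and a real translation service, neither of which is called here.

```python
def answer_with_fallback(question, lang, query_chapter, translate):
    """Try the chapter for `lang` first; fall back to English via translation.

    query_chapter(lang, question) -> list of answers (hypothetical interface)
    translate(question, source, target) -> translated question (hypothetical)
    Returns (answers, language of the knowledge base that was used).
    """
    answers = query_chapter(lang, question)
    if answers or lang == "en":
        return answers, lang
    english_question = translate(question, lang, "en")
    return query_chapter("en", english_question), "en"


# Toy stand-ins so the sketch runs without any network access.
TOY_KB = {
    ("en", "who wrote hamlet?"): ["William Shakespeare"],
}
TOY_TRANSLATIONS = {
    ("it", "chi ha scritto amleto?"): "who wrote hamlet?",
}


def toy_query_chapter(lang, question):
    return TOY_KB.get((lang, question.lower()), [])


def toy_translate(question, source, target):
    # Falls back to the untranslated question when no entry is known.
    return TOY_TRANSLATIONS.get((source, question.lower()), question)
```

Passing the chapter lookup and the translator in as arguments keeps the fallback logic independent of any particular translation API, so either side can be swapped without touching the control flow.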
Re: [Dbpedia-gsoc] QA engine (was Fwd: Request to access experimental code)
Also, I think a crucial point will be the multilingual capabilities of the tool. In this way, all the DBpedia chapters can benefit from it. So, the first implementation should focus on very simple questions, but in multiple languages. WH questions would be great. Of course, this requires language-specific validation. We will definitely need the help of the worldwide community. Sounds like the project is getting more and more exciting!

Cheers,

On 2/28/14, 12:08 PM, Marco Fossati wrote:

Hi Ankur,

On 2/28/14, 2:00 AM, Ankur Padia wrote:

Among the approaches listed before, I would prefer TBSL, as its scope is relatively wider.

All right, go ahead with that.

Ideally, a QA engine for DBpedia should be able to handle all kinds of questions, with appropriate semantic parsing and satisfactory conversion to SPARQL queries. The scope of a QA system depends heavily on the time at hand. For example, given the span of GSoC, addressing even a small number of English nuances in queries would be ambitious (correct me if I am wrong).

Exactly, keep in mind that a successful project implies a tool that actually works. Hence, I suggest proceeding first with the implementation of single-predicate queries, in order to provide reasonable coverage of simple questions.

Cheers,
--
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j
Re: [Dbpedia-gsoc] QA engine (was Fwd: Request to access experimental code)
Hello Marco,

I think the Google Translator API would come in handy for converting a foreign-language question into an English one in cases where a particular piece of knowledge or triple is missing from the given chapter [2], and then firing it against the English knowledge base, which does have it. However, there is a paper on Wh-questions and their possible semantic formalization in description logics [1] (just for information, no promotion).

- Ankur.

References:
[1] http://arxiv.org/abs/1312.6948
[2] Approach taken in QAKiS.

On Fri, Feb 28, 2014 at 4:54 PM, Marco Fossati hell.j@gmail.com wrote:

Also, I think a crucial point will be the multilingual capabilities of the tool. In this way, all the DBpedia chapters can benefit from it. So, the first implementation should focus on very simple questions, but in multiple languages. WH questions would be great. Of course, this requires language-specific validation. We will definitely need the help of the worldwide community. Sounds like the project is getting more and more exciting!

Cheers,

On 2/28/14, 12:08 PM, Marco Fossati wrote:

Hi Ankur,

On 2/28/14, 2:00 AM, Ankur Padia wrote:

Among the approaches listed before, I would prefer TBSL, as its scope is relatively wider.

All right, go ahead with that.

Ideally, a QA engine for DBpedia should be able to handle all kinds of questions, with appropriate semantic parsing and satisfactory conversion to SPARQL queries. The scope of a QA system depends heavily on the time at hand. For example, given the span of GSoC, addressing even a small number of English nuances in queries would be ambitious (correct me if I am wrong).

Exactly, keep in mind that a successful project implies a tool that actually works. Hence, I suggest proceeding first with the implementation of single-predicate queries, in order to provide reasonable coverage of simple questions.
Cheers,
--
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j