Hello,

thanks for that.

Should we proceed with the contribution?

The next steps are roughly as follows:
- Create a Jira issue for the contribution, and attach the source code to it
- Do a vote to accept it on the dev list
- Do IP clearance

IP clearance will most likely include signing these papers:
http://www.apache.org/licenses/software-grant.txt
http://www.apache.org/licenses/icla.txt
http://www.apache.org/licenses/cla-corporate.txt (if you work on the code during your day job)

After we are through these steps we can import the code into our subversion repository.

Jörn

On 8/15/11 8:39 PM, Boris Galitsky wrote:
Hi Jason and Jörn

I will briefly comment below on how our approach is different from the authors of this paper:
http://www.cs.utexas.edu/users/ml/papers/kim.coling10.pdf
(In Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010), 543-551, Beijing, China, August 2010.)

Sure, having something that maps trees to logical forms would be useful.

Boris, I would recommend you look at papers in Ray Mooney's group on
semantic parsing:

http://www.cs.utexas.edu/~ml/publications/area/77/learning_for_semantic_parsing
"The authors align naturallanguage sentences to their correct meaning 
representations given the ambiguous supervision
provided by a grounded language acquisition scenario".This approach takes a 
vertical domain, applies statistical learning and learns to find a better meaning 
representation, taking into account, in particular, parsing information. Mooney's et 
al approach cant directly map a syntactic tree structure into a logic form 
'structure', at least it does not intend to do so.
If the vertical domain changes, one has to re-train. That is adequate for a
RoboCup competition but not really for an industrial app in a horizontal
domain, in my opinion.
What we are describing/proposing does not go as high semantically as Mooney et
al., but it is domain-independent and is linked directly (in a structured, not
statistical, way) to the syntactic parse tree, so a user does not have to
worry about re-training. After training, if we have a fixed set of meanings
(meaning representations in Mooney's terms), his system would give a higher
accuracy than ours, but his settings are not really plausible for industrial
cases like search relevance and text relevance in a broader domain. What we
observed is that the overlap of syntactic trees, properly transformed, is usually
good enough to accept/reject relevance.
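
Just to make the idea concrete, here is a minimal sketch (not the contributed code) of using parse-tree overlap as a relevance filter; the node representation, scoring weights, and threshold are illustrative assumptions only:

    import java.util.*;

    class ParseNode {
        String label;                 // phrase or POS label, e.g. "NP", "VP", "NN"
        String word;                  // lexical item for leaves, null otherwise
        List<ParseNode> children = new ArrayList<>();

        ParseNode(String label, String word) { this.label = label; this.word = word; }
    }

    public class TreeOverlapRelevance {

        // Count matching nodes along a top-down alignment of the two trees:
        // a node pair counts if the labels agree, and counts extra if the words agree too.
        static int overlap(ParseNode a, ParseNode b) {
            if (a == null || b == null || !a.label.equals(b.label)) {
                return 0;
            }
            int score = 1;
            if (a.word != null && a.word.equals(b.word)) {
                score += 2; // a lexical match is worth more than a bare label match
            }
            int n = Math.min(a.children.size(), b.children.size());
            for (int i = 0; i < n; i++) {
                score += overlap(a.children.get(i), b.children.get(i));
            }
            return score;
        }

        // Accept the candidate as relevant if the overlap exceeds a (hypothetical) threshold.
        static boolean isRelevant(ParseNode query, ParseNode candidate, int threshold) {
            return overlap(query, candidate) >= threshold;
        }
    }

The real implementation of course transforms the trees first and uses a richer scoring, but the accept/reject decision has this general shape.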
In particular, Ruifang Ge (who is now at Facebook) did phrase structure to
logical form learning:
http://www.cs.utexas.edu/~ai-lab/pub-view.php?PubID=126959

I definitely enjoyed reading the PhD thesis; nice survey part! Earlier work of Mooney et
al. used Inductive Logic Programming to learn commonalities between syntactic structures.
Our approach kind of takes it to the extreme: syntactic parse trees are considered a special
case of logic formulas, and Inductive Logic Programming's anti-unification is defined
DIRECTLY on syntactic parse trees. I am more skeptical about the universality of a 'semantic
grammar' unless we focus on a given text classification domain. So my understanding is:
let's not go too far up in semantic representation unless the classification domain is
fixed; there is no such thing as a most accurate semantic representation for everything
(unless we are in as restricted a domain as specific database querying). So I can see a
"Meaning Representation Language Grammar" as a different component of OpenNLP,
but it is hard for me to see how a search engineer (not a linguist) can just plug it in
and leverage it in an industrial application.
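
To illustrate what "anti-unification defined directly on parse trees" means, here is a minimal sketch under my own simplified assumptions (class and method names are made up, not the contributed API): the least general generalization keeps the nodes on which two trees agree and replaces disagreements with a variable.

    import java.util.*;

    class Node {
        String label;                       // "NP", "VP", a word, or "*" for a generalization variable
        List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
    }

    public class TreeAntiUnifier {

        // Compute the least general generalization of two trees, treating them as terms:
        // keep a node where both trees agree, replace disagreements with the variable "*".
        static Node generalize(Node a, Node b) {
            if (!a.label.equals(b.label)) {
                return new Node("*");       // the anti-unification variable
            }
            Node g = new Node(a.label);
            int n = Math.min(a.children.size(), b.children.size());
            for (int i = 0; i < n; i++) {
                g.children.add(generalize(a.children.get(i), b.children.get(i)));
            }
            return g;
        }
    }

The resulting generalized tree is what we match against, instead of mapping either tree into a separate meaning representation first.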

Regards,
Boris

