Hi Sahar,

I don't know of any open-source rule-based parser, but there probably is one.
I only know a proprietary one called EngGram, which is built with Constraint Grammar (GPL). You can use EngGram online here:
http://beta.visl.sdu.dk/visl/en/parsing/automatic/

Regards,
William

On Fri, May 3, 2013 at 12:06 PM, Sahar Ebadi <[email protected]> wrote:

> Hi,
> Thanks for the replies! :)
>
> Lance:
> 1) Yes, I use the sentence detector just to split the text into sentences;
> I am not treating its output as valid sentences.
> 2) Watson goes beyond what I need. I only need to find good/valid sentences
> in the text (only NLP; it does not include reasoning and information
> retrieval as Watson does).
> 3) I know there should be some semi-effective solutions, but I am not able
> to find them. Can you give me some keywords or a short explanation of some
> of them? That would be a great help!
>
> So, what I have done:
> The only solution I found was to parse the sentence and then check whether
> it follows the standard grammatical pattern of a sentence. If so, it is a
> valid sentence; otherwise it is not. So far, I have parsed the sentences
> using OpenNLP, whose output is tagged according to the Penn Treebank. Now I
> need to know whether there is any standard sentence pattern based on the
> Penn Treebank.
>
> Ryan: the result will not be accurate enough.
>
> William: can you pass me the names of some rule-based parsers you have in
> mind (especially those compatible with OpenNLP)?
>
> I really appreciate any suggestions on this.
>
> Thank you all so much!
>
>
> On Wed, May 1, 2013 at 5:34 PM, Lance Norskog <[email protected]> wrote:
>
> > The "sentence detector" is for tokenizing (breaking text into words), not
> > analysis.
> >
> > The 'brute force' approach for removing non-English texts is to search
> > for higher-page Unicode. If it's over 255, it's not English. (Except
> > maybe for currency.)
> >
> > What you're talking about are semantically deep problems that have a lot
> > of semi-effective solutions. How deep do you want this analysis to be?
> > How close to IBM Watson do you expect to get?
> >
> >
> > On 04/30/2013 06:43 AM, Sahar Ebadi wrote:
> >
> >> Hi all,
> >>
> >> Let's say I have a text and I would like to detect only "good
> >> sentences". By "good sentences" I mean sentences that 1) are
> >> grammatically complete, 2) have meaning, and 3) are in English.
> >>
> >> As far as I can tell, the OpenNLP sentence detector only detects
> >> sentences according to punctuation (and a list of acronyms it has), so
> >> there is no guarantee that the sentences are real, complete, and
> >> meaningful sentences.
> >>
> >> Now my question is: is there any process in NLP that can help me to
> >>
> >> 1) find grammatically complete sentences?
> >> 2) find whether a sentence has meaning or not?
> >> 3) filter non-English texts?
> >>
> >> Any suggestions or useful resources are highly appreciated!
> >>
> >> Thanks.
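
P.S. For the third question (filtering non-English text), the brute-force check Lance describes in the quoted thread can be sketched roughly as follows. This is only an illustration of that heuristic, not something OpenNLP provides: the function names and the 0.1 threshold are my own assumptions, and the threshold exists only so that an occasional currency sign or curly quote does not reject an otherwise English sentence.

```python
def non_latin1_ratio(text: str) -> float:
    """Fraction of non-whitespace characters whose code point is above 255."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(ord(c) > 255 for c in chars) / len(chars)


def looks_english(text: str, max_ratio: float = 0.1) -> bool:
    # max_ratio is an arbitrary, illustrative threshold (an assumption):
    # it tolerates a few symbols such as the euro sign or curly quotes.
    return non_latin1_ratio(text) <= max_ratio


if __name__ == "__main__":
    for sample in ["The price is 5€ today.", "これは日本語の文です。"]:
        print(sample, looks_english(sample))
```

Note that this only rules out scripts outside the first 256 code points. Text in French, German, or Spanish passes the check just as English does, so it is a coarse first filter rather than real language identification.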
