Hi,
Thanks for the replies! :)

Lance:
1) Yes, I use the sentence detector only to split the text into sentences; I
do not assume that its output consists of valid sentences.
2) Watson goes beyond what I need. I only need to find good/valid sentences
in the text (NLP only, without the reasoning and information retrieval
that Watson does).
3) I know there should be some semi-effective solutions, but I am not able to
find them. Can you give me some keywords or a short explanation of some of
them? That would be a great help!

So, what I have done so far:
The only solution I found was to parse each sentence and then check whether
it follows a standard grammatical pattern for a sentence. If so, it is a
valid sentence; otherwise it is not. So far, I have parsed the sentences
using OpenNLP, whose parser output is tagged according to the Penn Treebank.
Now I need to know: is there any standard sentence pattern that is based on
the Penn Treebank tags?
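
To make the question concrete, here is a minimal sketch of the kind of check
I have in mind. It assumes the parse is available as a Penn Treebank style
bracketed string, and it treats "an S node with an NP child followed by a VP
child" as the pattern for a complete sentence. That pattern is only my
working assumption, not an established standard, and the function names
(parse_tree, looks_like_clause) are made up for illustration.

```python
def parse_tree(text):
    """Read a well-formed bracketed parse into nested (label, children) tuples.

    Leaves (the words) are kept as plain strings.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1  # consume "("
        # Some treebank files wrap the tree in an unlabeled node: "( (S ...))".
        if tokens[pos] == "(":
            label = ""
        else:
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return read_node()


def looks_like_clause(tree):
    """Return True if the root clause is an S with an NP followed by a VP.

    This is an assumed heuristic: it rejects fragments, but it also rejects
    imperatives such as "Run." that have no explicit subject.
    """
    label, children = tree
    # Skip wrapper nodes such as TOP or ROOT that hold a single subtree.
    while (label in ("TOP", "ROOT", "") and len(children) == 1
           and isinstance(children[0], tuple)):
        label, children = children[0]
    if label != "S":
        return False
    # Keep only phrase labels and drop function tags such as "-SBJ".
    base = [c[0].split("-")[0] for c in children if isinstance(c, tuple)]
    return "NP" in base and "VP" in base[base.index("NP") + 1:]


full = "(TOP (S (NP (DT The) (NN cat)) (VP (VBD sat)) (. .)))"
fragment = "(TOP (NP (DT The) (JJ big) (NN cat)))"
print(looks_like_clause(parse_tree(full)))      # True
print(looks_like_clause(parse_tree(fragment)))  # False
```

A check like this only looks at the top of the tree, so it says nothing
about whether the sentence has meaning, and it depends entirely on the
parser having produced a sensible tree in the first place.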

Ryan: the result will not be accurate enough.

Willian: can you give me the names of some rule-based parsers you have in
mind (especially ones compatible with OpenNLP)?

I really appreciate any suggestions on this.

Thank you all so much!


On Wed, May 1, 2013 at 5:34 PM, Lance Norskog <[email protected]> wrote:

> The "sentence detector" is for tokenizing (breaking text into words), not
> analysis.
>
> The 'brute force' approach for removing non-English texts is to search for
> higher-page Unicode. If it's over 255, it's not English. (Except maybe for
> currency.)
>
> What you're talking about are semantically deep problems that have a lot
> of semi-effective solutions. How deep do you want this analysis to be? How
> close to IBM Watson do you expect to get?
>
>
> On 04/30/2013 06:43 AM, Sahar Ebadi wrote:
>
>> Hi all,
>>
>> Let's say I have a text and I would like to detect only "good sentences".
>> By "good sentences" I mean sentences that 1) are grammatically complete,
>> 2) have meaning, and 3) are in the English language.
>>
>> As far as I found Open NLP sentence detector only detects sentences
>> according to punctuation(and a list of acronyms it has), so there is
>> no guarantee that the sentences are real, complete and meaningful
>> sentences.
>>
>> Now my question is: is there any process in NLP that can help me to:
>>
>> 1)find grammatically complete sentences?
>> 2)find if a sentence has meaning or no?
>> 3)filter non-english texts?
>>
>> any suggestions or sharing useful resources is highly appreciated!
>>
>> Thanks.
>>
>>
>
