Re: alerting system with Solr's Streaming Expressions

Susheel Kumar Thu, 09 Feb 2017 08:25:36 -0800

Got it. Thanks, Joel.

On Thu, Feb 9, 2017 at 11:17 AM, Susheel Kumar <susheel2...@gmail.com>
wrote:


> I increased from 250 to 2500 and from 100 to 1000 when I didn't get the
> expected result.  Let me add more examples.
>
> Thanks,
> Susheel
>
> On Thu, Feb 9, 2017 at 11:03 AM, Joel Bernstein <joels...@gmail.com>
> wrote:
>
>> A few things that I see right off:
>>
>> 1) 2500 terms is too many. I was testing with 100-250 terms
>> 2) 1000 iterations is too high. If the model hasn't converged by 100
>> iterations it's likely not going to converge.
>> 3) You're going to need more examples. You may want to run features first
>> and see what it selects. Then you need multiple examples for each feature.
>> I was testing with the enron ham/spam data set. It would be good to
>> download that dataset and see what that looks like.
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>> On Thu, Feb 9, 2017 at 10:15 AM, Susheel Kumar <susheel2...@gmail.com>
>> wrote:
>>
>> > Hello Joel,
>> >
>> > Here is the final iteration in json format.
>> >
>> >  https://www.dropbox.com/s/g3a3606ms6cu8q4/final_iteration.json?dl=0
>> >
>> > Below is the expression used
>> >
>> > update(models,
>> >              batchSize="50",
>> >              train(trainingSet,
>> >                       features(trainingSet,
>> >                                      q="*:*",
>> >                                      featureSet="threatFeatures",
>> >                                      field="body_txt",
>> >                                      outcome="out_i",
>> >                                      numTerms=2500),
>> >                       q="*:*",
>> >                       name="threatModel",
>> >                       field="body_txt",
>> >                       outcome="out_i",
>> >                       maxIterations="1000"))
>> >
>> > I just have 16 documents: 8 positive and 8 negative. The field that
>> > contains the feedback is body_txt (text_general type).
>> >
>> > Thanks for looking.
>> >
>> >
>> >
>> > On Wed, Feb 8, 2017 at 7:52 AM, Joel Bernstein <joels...@gmail.com>
>> > wrote:
>> >
>> > > Can you post the final iteration of the model?
>> > >
>> > > Also the expression you used to train the model?
>> > >
>> > > How much training data do you have? How many positive and negative
>> > > examples?
>> > >
>> > > Joel Bernstein
>> > > http://joelsolr.blogspot.com/
>> > >
>> > > On Tue, Feb 7, 2017 at 2:14 PM, Susheel Kumar <susheel2...@gmail.com>
>> > > wrote:
>> > >
>> > > > Hello,
>> > > >
>> > > > I tried to follow http://joelsolr.blogspot.com/ to see if we can
>> > > > classify positive and negative feedback using streaming expressions.
>> > > > Everything works except the end result: the probability_d value from
>> > > > the classify expression is similar for positive and negative
>> > > > feedback. See below.
>> > > >
>> > > > What might I be missing here? Do I need to put more data in the
>> > > > training set, or something else?
>> > > >
>> > > >
>> > > > { "result-set": { "docs": [
>> > > >   { "body_txt": [ "love the company" ],
>> > > >     "score_d": 2.1892474120319667, "id": "6",
>> > > >     "probability_d": 0.977944433135261 },
>> > > >   { "body_txt": [ "bad experience " ],
>> > > >     "score_d": 3.1689453250842914, "id": "5",
>> > > >     "probability_d": 0.9888109278133054 },
>> > > >   { "body_txt": [ "This company rewards its employees, but you should only work here if you truly love sales. The stress of the job can get to you and they definitely push you." ],
>> > > >     "score_d": 4.621702323888672, "id": "4",
>> > > >     "probability_d": 0.9999999999898557 },
>> > > >   { "body_txt": [ "no chance for advancement with that company every year I was there it got worse I don't know if all branches of adp but Florence organization was turn over rate would be higher if it was for temp workers" ],
>> > > >     "score_d": 5.288898825826228, "id": "3",
>> > > >     "probability_d": 0.9999999999999956 },
>> > > >   { "body_txt": [ "It was a pleasure to work at the Milpitas campus. The team that works there are professional and dedicated individuals. The level of loyalty and dedication is impressive" ],
>> > > >     "score_d": 2.5303947056922937, "id": "2",
>> > > >     "probability_d": 0.9999990430778418 },
>> > > >
>> > >
>> >
>>
>
>
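The symptom described at the start of the thread can be checked with a few lines of Python. This is only a rough sketch: the probability_d values are copied from the classify output quoted above, while the "expected" labels and the 0.5 cut-off are assumptions read off the feedback text and are not part of the original output. Document 4 is left out because its feedback is mixed.

```python
# probability_d values copied from the classify output quoted in this thread.
# The "expected" labels are an assumption inferred from the feedback text
# (1 = positive feedback, 0 = negative feedback); they are not in the output.
results = [
    {"id": "6", "expected": 1, "probability_d": 0.977944433135261},   # "love the company"
    {"id": "5", "expected": 0, "probability_d": 0.9888109278133054},  # "bad experience"
    {"id": "3", "expected": 0, "probability_d": 0.9999999999999956},  # "no chance for advancement..."
    {"id": "2", "expected": 1, "probability_d": 0.9999990430778418},  # "It was a pleasure to work..."
]

THRESHOLD = 0.5  # assumed decision cut-off, not stated anywhere in the thread


def summarize(rows, threshold=THRESHOLD):
    """Compare thresholded predictions with the assumed labels."""
    predicted = [1 if r["probability_d"] >= threshold else 0 for r in rows]
    correct = sum(p == r["expected"] for p, r in zip(predicted, rows))
    pos = [r["probability_d"] for r in rows if r["expected"] == 1]
    neg = [r["probability_d"] for r in rows if r["expected"] == 0]
    return {
        "predicted": predicted,
        "accuracy": correct / len(rows),
        "mean_positive": sum(pos) / len(pos),
        "mean_negative": sum(neg) / len(neg),
    }


summary = summarize(results)
print(summary)
```

Under these assumptions every document lands on the positive side of the cut-off, and the negative examples score slightly higher on average than the positive ones, which is consistent with a model that has not learned to separate the two classes from 16 training documents.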
