Re: Continuous Query
From: narges saleh
Sent: Sunday, 4 October 2020 2:03 AM
To: user@ignite.apache.org
Subject: Re: Continuous Query

The latter; the server needs to perform some calculations on the data without sending any notification to the app.

On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dma...@apache.org> wrote:

And after you detect a record that satisfies the condition, do you need to send any notification to the application? Or is it more like the server detects and does some calculation locally without updating the app?

- Denis

On Fri, Oct 2, 2020 at 11:22 AM narges saleh <snarges...@gmail.com> wrote:

The detection should happen at most a couple of minutes after a record is inserted into the cache, but all the detections are local to the node. Some records with the current timestamp might show up in the system with big delays.

On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dma...@apache.org> wrote:

What are your requirements? Do you need to process the records as soon as they are put into the cluster?

On Friday, October 2, 2020, narges saleh <snarges...@gmail.com> wrote:

Thank you Denis for the reply. From the perspective of performance/resource overhead and reliability, which approach is preferable? Does a continuous-query-based approach impose a lot more overhead?

On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dma...@apache.org> wrote:

Hi Narges,

Use continuous queries if you need to be notified in real time: 1) a record is inserted, 2) the continuous query's filter confirms the record's time satisfies your condition, 3) the continuous query notifies your application, which does the required processing. Jobs are better for a batching use case, when it's OK to process records together with some delay.
- Denis

On Fri, Oct 2, 2020 at 3:50 AM narges saleh <snarges...@gmail.com> wrote:

Hi All,

If I want to watch for a rolling timestamp pattern in all the records that get inserted into all my caches, is it more efficient to use timer-based jobs (which check all the records at some interval) or continuous queries that locally filter on the pattern? These records can get inserted in any order, and some can arrive with delays. An example is to watch for all the records whose timestamp ends in 50, where the timestamp is in the format -mm-dd hh:mi.

thanks

This email and any files transmitted with it are confidential, proprietary and intended solely for the individual or entity to whom they are addressed. If you have received this email in error please delete it immediately.
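The two approaches the thread compares both reduce to evaluating the same per-record predicate: eagerly (a continuous query's remote filter, local to the node that owns the entry) or periodically (a timer-based job scanning the cache). A minimal sketch of that predicate in plain Python, assuming the timestamp format is along the lines of "yyyy-mm-dd hh:mi" and that "ends in 50" means the minute field is 50 (both are assumptions, not confirmed in the thread):

```python
from datetime import datetime

def matches_pattern(ts: str) -> bool:
    # Assumed format "yyyy-mm-dd hh:mi"; "ends in 50" is read here as
    # "the minute field equals 50". Both are assumptions.
    parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M")
    return parsed.minute == 50

# A continuous-query remote filter would apply this to each entry as it
# is inserted; a timer-based job would apply it to every entry in the
# local partition on each tick.
print(matches_pattern("2020-10-02 13:50"))  # True
print(matches_pattern("2020-10-02 13:05"))  # False
```

Since the predicate does not depend on arrival order, late-arriving records are handled the same way by either approach; the trade-off is detection latency (per-interval scans) versus per-insert overhead (continuous filtering).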
RE: Preprocessing of data to use in Naive-Bayes
Hi,

If there is any update, please let me know.

Thanks

From: Priya Yadav
Sent: Sunday, September 6, 2020 8:14 PM
To: Alexey Zinoviev
Cc: user@ignite.apache.org
Subject: RE: Preprocessing of data to use in Naive-Bayes

Hi Alexey,

I am stuck on the preprocessing step itself, as I am not able to find any API which takes a sentence, reads the tokens, and calculates their counts, whereas scikit-learn provides such APIs out of the box. I am attaching the sample data that I need to categorize on the basis of user experience. Please find the Python code snippet below:

    from sklearn.naive_bayes import MultinomialNB
    from sklearn.feature_extraction.text import CountVectorizer

    classifier = MultinomialNB()
    vect = CountVectorizer()
    comments = ["pizza was soft, very nice",
                "good ambience and excellent service",
                "took a long time, service needs improvement",
                "toppings were very less, but bread was excellent"]
    counts = vect.fit_transform(comments)
    targets = ['Good Experience', 'Good Experience', 'Bad Experience', 'Good Experience']
    classifier.fit(counts, targets)

    predictComments = ["soft bread, nice toppings"]
    predictData = vect.transform(predictComments)
    predictions = classifier.predict(predictData)
    print(predictions)

Thanks,
Priya

From: Alexey Zinoviev <zaleslaw@gmail.com>
Sent: Sunday, September 6, 2020 6:41 PM
To: Igor Belyakov <igor.belyako...@gmail.com>
Cc: user <user@ignite.apache.org>
Subject: Re: Preprocessing of data to use in Naive-Bayes

Very interesting case!
We have 3 different implementations of the NaiveBayes algorithm: https://apacheignite.readme.io/docs/naive-bayes

I suppose this one is the best for the task: https://apacheignite.readme.io/docs/naive-bayes#discrete-bernoulli-naive-bayes

Data should be prepared as Vectors in an Ignite cache to start training.

Dear Priya Yadav, could you please provide code or pseudocode showing how you populate your Ignite cache with sentence data? A few example sentences would be useful too. It would also be useful to see how you would solve this task in scikit-learn; I'll try to help with the preprocessing code for this case.

Sincerely yours,
Alexey

On Fri, Sep 4, 2020 at 19:40, Igor Belyakov <igor.belyako...@gmail.com> wrote:

Alexey,

Do you have any thoughts regarding that?

Igor

On Fri, Sep 4, 2020 at 10:03 AM Priya Yadav <priyaya...@fico.com> wrote:

Hi,

Problem Statement: I have feedback sentences containing words separated by spaces, like normal English sentences. Using these sentences, I need to classify them into categories based on some keywords. How should I preprocess my data in order to use it in Naive-Bayes? Any leads would be helpful. Thanks in advance.
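The missing piece the thread identifies is the count-vectorization step that scikit-learn's CountVectorizer performs: build a vocabulary over the training sentences, then map each sentence to a fixed-length vector of token counts. A hand-rolled sketch of that step in plain Python — the resulting rows are the kind of numeric vectors that could then be loaded into an Ignite cache for NaiveBayes training (function names here are illustrative, not an Ignite or scikit-learn API):

```python
import re
from collections import Counter

def tokenize(sentence):
    # Lowercase and extract alphabetic tokens, roughly what
    # CountVectorizer's default token pattern does.
    return re.findall(r"[a-z]+", sentence.lower())

def build_vocabulary(sentences):
    # One index per distinct token, assigned in sorted order.
    vocab = sorted({tok for s in sentences for tok in tokenize(s)})
    return {tok: i for i, tok in enumerate(vocab)}

def to_count_vector(sentence, vocab):
    # Fixed-length vector: count of each vocabulary token in the sentence.
    vec = [0] * len(vocab)
    for tok, n in Counter(tokenize(sentence)).items():
        if tok in vocab:
            vec[vocab[tok]] = n
    return vec

sentences = ["pizza was soft, very nice",
             "good ambience and excellent service"]
vocab = build_vocabulary(sentences)
vectors = [to_count_vector(s, vocab) for s in sentences]
```

Each count vector, paired with its label ('Good Experience' / 'Bad Experience'), would then be the cache entry handed to one of the NaiveBayes trainers; for the Bernoulli variant linked above, the counts could further be clipped to 0/1 presence flags.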
Preprocessing of data to use in Naive-Bayes
Hi,

Problem Statement: I have feedback sentences containing words separated by spaces, like normal English sentences. Using these sentences, I need to classify them into categories based on some keywords. How should I preprocess my data in order to use it in Naive-Bayes? Any leads would be helpful. Thanks in advance.