[jira] [Created] (SOLR-6689) Ability to partially update multiple documents with a query
Siddharth Gargate created SOLR-6689:
---------------------------------------

             Summary: Ability to partially update multiple documents with a query
                 Key: SOLR-6689
                 URL: https://issues.apache.org/jira/browse/SOLR-6689
             Project: Solr
          Issue Type: New Feature
    Affects Versions: 4.10.2
            Reporter: Siddharth Gargate

Solr allows us to update parts of a document, but this is limited to a single document identified by its ID. We should be able to partially update multiple documents matched by a specified query.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
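For context, the existing per-document atomic-update API that the request builds on addresses exactly one document by its unique key. A minimal sketch of that request body (field names here are illustrative, not from the ticket):

```json
[
  {
    "id":    "doc1",
    "price": { "set": 99 },
    "tags":  { "add": "sale" }
  }
]
```

The feature request asks for the same modifier semantics (`set`, `add`, `inc`) to be applied to every document matching a query instead of a single `id`; no such syntax exists in Solr 4.10.2, so any query-based form would be new.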
[jira] Commented: (LUCENE-666) TERM1 OR NOT TERM2 does not perform as expected
[ https://issues.apache.org/jira/browse/LUCENE-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767802#action_12767802 ]

Siddharth Gargate commented on LUCENE-666:
------------------------------------------

Can we rewrite the query (A OR NOT B) to NOT(NOT(A) AND B) to solve this issue?

> TERM1 OR NOT TERM2 does not perform as expected
> -----------------------------------------------
>
>                 Key: LUCENE-666
>                 URL: https://issues.apache.org/jira/browse/LUCENE-666
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: QueryParser
>    Affects Versions: 2.0.0
>         Environment: Windows XP, JavaCC 4.0, JDK 1.5
>            Reporter: Dejan Nenov
>         Attachments: TestAornotB.java
>
> test:
> [junit] Testsuite: org.apache.lucene.search.TestAornotB
> [junit] Tests run: 3, Failures: 1, Errors: 0, Time elapsed: 0.39 sec
> [junit] ------------- Standard Output ---------------
> [junit] Doc1 = A B C
> [junit] Doc2 = A B C D
> [junit] Doc3 = A C D
> [junit] Doc4 = B C D
> [junit] Doc5 = C D
> [junit] -------------
> [junit] With query "A OR NOT B" we expect to hit
> [junit] all documents EXCEPT Doc4, instead we only match on Doc3.
> [junit] While Lucene currently explicitly does not support queries of
> [junit] the type "find docs that do not contain TERM" - this explains
> [junit] not finding Doc5, but does not justify eliminating Doc1 and Doc2.
> [junit] -------------
> [junit] The fix should likely require a modification to QueryParser.jj
> [junit] around the method:
> [junit] protected void addClause(Vector clauses, int conj, int mods, Query q)
> [junit] Query: c:a -c:b   hits.length=1
> [junit] Query Found: Doc[0] = A C D
> [junit] 0.0 = (NON-MATCH) Failure to meet condition(s) of required/prohibited clause(s)
> [junit]   0.6115718 = (MATCH) fieldWeight(c:a in 1), product of:
> [junit]     1.0 = tf(termFreq(c:a)=1)
> [junit]     1.2231436 = idf(docFreq=3)
> [junit]     0.5 = fieldNorm(field=c, doc=1)
> [junit]   0.0 = match on prohibited clause (c:b)
> [junit]     0.6115718 = (MATCH) fieldWeight(c:b in 1), product of:
> [junit]       1.0 = tf(termFreq(c:b)=1)
> [junit]       1.2231436 = idf(docFreq=3)
> [junit]       0.5 = fieldNorm(field=c, doc=1)
> [junit] 0.6115718 = (MATCH) sum of:
> [junit]   0.6115718 = (MATCH) fieldWeight(c:a in 2), product of:
> [junit]     1.0 = tf(termFreq(c:a)=1)
> [junit]     1.2231436 = idf(docFreq=3)
> [junit]     0.5 = fieldNorm(field=c, doc=2)
> [junit] 0.0 = (NON-MATCH) Failure to meet condition(s) of required/prohibited clause(s)
> [junit]   0.0 = match on prohibited clause (c:b)
> [junit]     0.6115718 = (MATCH) fieldWeight(c:b in 3), product of:
> [junit]       1.0 = tf(termFreq(c:b)=1)
> [junit]       1.2231436 = idf(docFreq=3)
> [junit]       0.5 = fieldNorm(field=c, doc=3)
> [junit] Query: c:a (-c:b)   hits.length=3
> [junit] Query Found: Doc[0] = A B C
> [junit] Query Found: Doc[1] = A B C D
> [junit] Query Found: Doc[2] = A C D
> [junit] 0.3057859 = (MATCH) product of:
> [junit]   0.6115718 = (MATCH) sum of:
> [junit]     0.6115718 = (MATCH) fieldWeight(c:a in 1), product of:
> [junit]       1.0 = tf(termFreq(c:a)=1)
> [junit]       1.2231436 = idf(docFreq=3)
> [junit]       0.5 = fieldNorm(field=c, doc=1)
> [junit]   0.5 = coord(1/2)
> [junit] 0.3057859 = (MATCH) product of:
> [junit]   0.6115718 = (MATCH) sum of:
> [junit]     0.6115718 = (MATCH) fieldWeight(c:a in 2), product of:
> [junit]       1.0 = tf(termFreq(c:a)=1)
> [junit]       1.2231436 = idf(docFreq=3)
> [junit]       0.5 = fieldNorm(field=c, doc=2)
> [junit]   0.5 = coord(1/2)
> [junit] 0.0 = (NON-MATCH) product of:
> [junit]   0.0 = (NON-MATCH) sum of:
> [junit]   0.0 = coord(0/2)
> [junit] ------------- ---------------
> [junit] Testcase: testFAIL(org.apache.lucene.search.TestAornotB): FAILED
> [junit] resultDocs =A C D expected:3 but was:1
> [junit] junit.framework.AssertionFailedError: resultDocs =A C D expected:3 but was:1
> [junit]     at org.apache.lucene.search.TestAornotB.testFAIL(TestAornotB.java:137)
> [junit] Test org.apache.lucene.search.TestAornotB FAILED

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
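The rewrite proposed in the comment is sound as pure Boolean logic: by De Morgan's laws, (A OR NOT B) is identical to NOT(NOT(A) AND B). A plain-Java truth-table check (no Lucene required), using the same five documents as the attached TestAornotB.java, confirms this and the expected hit count:

```java
import java.util.List;
import java.util.Set;

// Truth-table check of the proposed rewrite over the five test documents.
public class AOrNotBRewrite {

    static final List<Set<String>> DOCS = List.of(
            Set.of("A", "B", "C"),      // Doc1
            Set.of("A", "B", "C", "D"), // Doc2
            Set.of("A", "C", "D"),      // Doc3
            Set.of("B", "C", "D"),      // Doc4
            Set.of("C", "D"));          // Doc5

    // Original query: A OR NOT B
    static boolean aOrNotB(Set<String> doc) {
        return doc.contains("A") || !doc.contains("B");
    }

    // Proposed rewrite: NOT(NOT(A) AND B)
    static boolean rewritten(Set<String> doc) {
        return !(!doc.contains("A") && doc.contains("B"));
    }

    // Number of documents the query should match: 4, i.e. all except Doc4.
    static int countMatches() {
        int n = 0;
        for (Set<String> doc : DOCS) {
            if (aOrNotB(doc)) n++;
        }
        return n;
    }

    // True iff the rewrite agrees with the original on every document.
    static boolean rewriteAgrees() {
        for (Set<String> doc : DOCS) {
            if (aOrNotB(doc) != rewritten(doc)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("matches=" + countMatches() + " agrees=" + rewriteAgrees());
        // prints "matches=4 agrees=true"
    }
}
```

Note the caveat: the rewritten form still contains a purely negative inner clause, which Lucene's Boolean model cannot score on its own, so the usual practical workaround is to pair the negation with a match-all clause (e.g. a BooleanQuery combining the term with a MatchAllDocsQuery-minus-term subquery). The sketch above only establishes the logical equivalence, not a query-parser fix.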
[jira] Commented: (SOLR-1150) OutofMemoryError on enabling highlighting
[ https://issues.apache.org/jira/browse/SOLR-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711952#action_12711952 ]

Siddharth Gargate commented on SOLR-1150:
-----------------------------------------

Thanks Mark. The SolrIndexSearcher.readDocs method internally reads one doc at a time, so there shouldn't be any performance loss.

> OutofMemoryError on enabling highlighting
> -----------------------------------------
>
>                 Key: SOLR-1150
>                 URL: https://issues.apache.org/jira/browse/SOLR-1150
>             Project: Solr
>          Issue Type: Improvement
>          Components: highlighter
>    Affects Versions: 1.4
>            Reporter: Siddharth Gargate
>             Fix For: 1.4
>         Attachments: SOLR-1150.patch
>
> Please refer to the following mail thread: http://markmail.org/message/5nhkm5h3ongqlput
> I am testing with a 2 MB document size and just 500 documents. Indexing works fine even with a 128 MB heap, but on searching Solr throws an OOM error. This issue is observed only when highlighting is enabled. While indexing I am storing 1 MB of text per document. While searching, Solr reads all 500 documents into memory, including the complete 1 MB stored field of each. Due to this, 500 docs * 1 MB * 2 (2 bytes per char) = 1000 MB of memory is required for searching. This memory usage can be reduced by reading one document at a time.
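The arithmetic in the report can be checked with a back-of-the-envelope sketch (assumptions: 1 MB of text means 1,000,000 characters per stored field, Java holds each char in 2 bytes, and per-object overheads are ignored):

```java
// Back-of-the-envelope check of the memory figure in SOLR-1150:
// 500 stored documents, each with ~1 MB of text, held in memory as Java chars.
public class HighlightMemoryEstimate {

    // Bytes needed to hold the stored text of all documents as Java strings.
    static long bytesNeeded(int docCount, int charsPerDoc) {
        return (long) docCount * charsPerDoc * 2L; // 2 bytes per Java char
    }

    public static void main(String[] args) {
        long bytes = bytesNeeded(500, 1_000_000);
        System.out.println(bytes / 1_000_000 + " MB"); // prints "1000 MB"
    }
}
```

Reading and highlighting one document at a time would cap the working set at roughly one document's stored text (~2 MB here) instead of the full 1000 MB.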
[jira] Created: (SOLR-1150) OutofMemoryError on enabling highlighting
OutofMemoryError on enabling highlighting
-----------------------------------------

                 Key: SOLR-1150
                 URL: https://issues.apache.org/jira/browse/SOLR-1150
             Project: Solr
          Issue Type: Bug
          Components: highlighter
    Affects Versions: 1.4
            Reporter: Siddharth Gargate
             Fix For: 1.4

Please refer to the following mail thread: http://markmail.org/message/5nhkm5h3ongqlput

I am testing with a 2 MB document size and just 500 documents. Indexing works fine even with a 128 MB heap, but on searching Solr throws an OOM error. This issue is observed only when highlighting is enabled. While indexing I am storing 1 MB of text per document. While searching, Solr reads all 500 documents into memory, including the complete 1 MB stored field of each. Due to this, 500 docs * 1 MB * 2 (2 bytes per char) = 1000 MB of memory is required for searching. This memory usage can be reduced by reading one document at a time.
[jira] Updated: (SOLR-1150) OutofMemoryError on enabling highlighting
[ https://issues.apache.org/jira/browse/SOLR-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Gargate updated SOLR-1150:
------------------------------------

    Issue Type: Improvement  (was: Bug)

> OutofMemoryError on enabling highlighting
> -----------------------------------------
>
>                 Key: SOLR-1150
>                 URL: https://issues.apache.org/jira/browse/SOLR-1150
>             Project: Solr
>          Issue Type: Improvement
>          Components: highlighter
>    Affects Versions: 1.4
>            Reporter: Siddharth Gargate
>             Fix For: 1.4
>
> Please refer to the following mail thread: http://markmail.org/message/5nhkm5h3ongqlput
> I am testing with a 2 MB document size and just 500 documents. Indexing works fine even with a 128 MB heap, but on searching Solr throws an OOM error. This issue is observed only when highlighting is enabled. While indexing I am storing 1 MB of text per document. While searching, Solr reads all 500 documents into memory, including the complete 1 MB stored field of each. Due to this, 500 docs * 1 MB * 2 (2 bytes per char) = 1000 MB of memory is required for searching. This memory usage can be reduced by reading one document at a time.
[jira] Created: (TIKA-208) Special characters in HTML file are not parsed correctly
Special characters in HTML file are not parsed correctly
--------------------------------------------------------

                 Key: TIKA-208
                 URL: https://issues.apache.org/jira/browse/TIKA-208
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 0.3
            Reporter: Siddharth Gargate

Words containing the characters ä and ö are not parsed correctly when present in an HTML document. Please refer to the discussion: http://markmail.org/message/jgwzbw63o67amqu3
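The ticket does not state the root cause. A common way characters like ä and ö get mangled during HTML parsing is decoding UTF-8 bytes with the wrong charset (for instance when the document's declared encoding is missing or ignored). This plain-Java sketch reproduces that failure mode as an illustration; it does not use Tika itself, and whether this is TIKA-208's actual cause is an assumption:

```java
import java.nio.charset.StandardCharsets;

// Reproduce classic charset mojibake: UTF-8 bytes decoded as ISO-8859-1.
public class CharsetMojibake {

    // Encode text as UTF-8, then decode it with the wrong charset,
    // as a parser might when it mis-detects the document encoding.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(misdecode("Mäuse"));  // prints "MÃ¤use"
        System.out.println(misdecode("schön"));  // prints "schÃ¶n"
    }
}
```

Each two-byte UTF-8 sequence (ä is 0xC3 0xA4) becomes two separate Latin-1 characters, which matches the typical "word splits at the special character" symptom.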