[jira] [Comment Edited] (LUCENE-6037) PendingTerm cannot be cast to PendingBlock

2014-11-02 Thread zhanlijun (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193740#comment-14193740 ]

zhanlijun edited comment on LUCENE-6037 at 11/2/14 11:29 AM:
-

   The lucene-spatial module changes are unrelated to the bug, because the bug 
also happens when I use the native lucene-spatial module. 
   The lucene-spatial module is widely used in mobile internet applications in 
China. My application needs to calculate the distance between the user and all 
POIs in a city. However, when the number of POIs in one city exceeds 100,000, 
the distance calculation in Lucene becomes very slow (more than 10ms). Lucene 
uses spatial4j's HaversineRAD to calculate the distance, and I ran a test on my 
computer (2.9GHz Intel Core i7, 8GB mem):
POI num   |  time
50,000    |  7ms
100,000   |  14ms
1,000,000 |  144ms
   I simplified the distance calculation formula. The simplification greatly 
improves computational efficiency while keeping enough precision for practical 
use. Here is the result of the test:
test point pair                    | disSimplify (meter) | distHaversineRAD (meter) | diff (meter)
(39.941, 116.45) (39.94, 116.451)  | 140.024276920       | 140.02851671981400       | 0.0
(39.96, 116.45) (39.94, 116.40)    | 4804.113098854450   | 4804.421153907680        | 0.3
(39.96, 116.45) (39.94, 117.30)    | 72438.90919479560   | 72444.54071519510        | 5.6
(39.26, 115.25) (41.04, 117.30)    | 263516.676171262    | 263508.55921886700       | 8.1

POI num   |  time
50,000    |  0.1
100,000   |  0.3
1,000,000 |  4
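
The comment does not include the simplified formula itself. As a rough 
illustration only, the sketch below compares a standard haversine distance with 
the equirectangular approximation, which is one common way to simplify it. 
Whether this matches disSimplify is an assumption, as are the class name, the 
method names, and the Earth radius constant, so its output will not reproduce 
the figures in the table exactly.

```java
public class DistanceSketch {
    // Mean Earth radius in meters. Assumed value; spatial4j defines its own constant.
    static final double EARTH_RADIUS_M = 6_371_008.8;

    /** Great-circle distance in meters using the haversine formula. Inputs are in degrees. */
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinHalfDLat = Math.sin((phi2 - phi1) / 2);
        double sinHalfDLon = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        double a = sinHalfDLat * sinHalfDLat
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfDLon * sinHalfDLon;
        // Clamp to 1.0 so rounding error can never push asin out of its domain.
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /**
     * Equirectangular approximation: treats the area between the two points as flat
     * and scales the longitude difference by the cosine of the mean latitude.
     * Needs one cos and one sqrt, versus several trig calls for haversine.
     */
    static double equirectangularMeters(double lat1, double lon1, double lat2, double lon2) {
        double x = Math.toRadians(lon2 - lon1) * Math.cos(Math.toRadians((lat1 + lat2) / 2));
        double y = Math.toRadians(lat2 - lat1);
        return EARTH_RADIUS_M * Math.sqrt(x * x + y * y);
    }

    public static void main(String[] args) {
        // Point pairs taken from the table in the comment.
        double[][] pairs = {
            {39.941, 116.45, 39.94, 116.451},
            {39.96, 116.45, 39.94, 116.40},
            {39.96, 116.45, 39.94, 117.30},
            {39.26, 115.25, 41.04, 117.30},
        };
        for (double[] p : pairs) {
            double h = haversineMeters(p[0], p[1], p[2], p[3]);
            double e = equirectangularMeters(p[0], p[1], p[2], p[3]);
            System.out.printf("haversine=%.3f m  approx=%.3f m  diff=%.3f m%n",
                    h, e, Math.abs(h - e));
        }
    }
}
```

An approximation of this kind is only suitable for city-scale distances; the 
error grows with the distance between the points and near the poles.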


was (Author: zhanlijun):
   The lucene-spatial module changes are unrelated to the bug, because the bug 
also happens when I use the native lucene-spatial module. 
   The lucene-spatial module is widely used in mobile internet applications in 
China. My application needs to calculate the distance between the user and all 
POIs in a city. However, when the number of POIs in one city exceeds 100,000, 
the distance calculation in Lucene becomes very slow (more than 10ms). Lucene 
uses spatial4j's HaversineRAD to calculate the distance, and I ran a test on my 
computer (2.9GHz Intel Core i7, 8GB mem):
POI num (w = 10,000) |  time
5w                   |  7ms
10w                  |  14ms
100w                 |  144ms
   I simplified the distance calculation formula. The simplification greatly 
improves computational efficiency while keeping enough precision for practical 
use. Here is the result of the test:
test point pair                    | disSimplify (meter) | distHaversineRAD (meter) | diff (meter)
(39.941, 116.45) (39.94, 116.451)  | 140.024276920       | 140.02851671981400       | 0.0
(39.96, 116.45) (39.94, 116.40)    | 4804.113098854450   | 4804.421153907680        | 0.3
(39.96, 116.45) (39.94, 117.30)    | 72438.90919479560   | 72444.54071519510        | 5.6
(39.26, 115.25) (41.04, 117.30)    | 263516.676171262    | 263508.55921886700       | 8.1

POI num |  time
5w  | 0.1
10w | 0.3
100w| 4

 PendingTerm cannot be cast to PendingBlock
 --

 Key: LUCENE-6037
 URL: https://issues.apache.org/jira/browse/LUCENE-6037
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/codecs
Affects Versions: 4.3.1
 Environment: ubuntu 64bit
Reporter: zhanlijun
Priority: Critical
 Fix For: 4.3.1


 The error is as follows:
 java.lang.ClassCastException: org.apache.lucene.codecs.BlockTreeTermsWriter$PendingTerm cannot be cast to org.apache.lucene.codecs.BlockTreeTermsWriter$PendingBlock
 at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.finish(BlockTreeTermsWriter.java:1014)
 at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:553)
 at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
 at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
 at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
 at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
 at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:493)
 at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:480)
 at org.apache.lucene.index.DocumentsWriter.postUpdate(DocumentsWriter.java:378)
 at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:413)
 at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1283)
  

[jira] [Comment Edited] (LUCENE-6037) PendingTerm cannot be cast to PendingBlock

2014-11-02 Thread zhanlijun (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194130#comment-14194130 ]

zhanlijun edited comment on LUCENE-6037 at 11/3/14 3:59 AM:


I found the cause of the problem. 
My application adds the same document to the index repeatedly by calling 
indexwriter.addDocuments(). To improve indexing efficiency, I stored the 
document in a static variable. This works well in a single-threaded 
environment; however, when multiple threads call indexwriter.addDocuments() 
with that shared document, it causes the error. 
 
The solution: I create a new document for each thread, and the error no 
longer occurs.
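
The fix described above can be sketched without Lucene on the classpath. In the 
example below, Doc is a hypothetical stand-in for a mutable document and the 
sink stands in for a call such as indexwriter.addDocuments(); neither is a 
Lucene API, and the class and method names are made up for illustration. The 
only point is that each task builds its own instance instead of reading one 
shared static instance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

public class PerThreadDocSketch {
    /** Hypothetical stand-in for a mutable document. Not a Lucene class. */
    static final class Doc {
        final Map<String, String> fields = new HashMap<>();
    }

    /** Builds a fresh Doc on every call, so no instance is shared between threads. */
    static Doc newDoc(String id) {
        Doc doc = new Doc();
        doc.fields.put("id", id);
        return doc;
    }

    /**
     * Submits the given number of tasks to a thread pool. Each task creates its own
     * Doc and passes it to the sink. Returns the number of completed tasks.
     */
    static int indexConcurrently(int tasks, Consumer<Doc> sink) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final String id = "doc-" + i;
                // The Doc is created inside the task, not captured from a static field.
                futures.add(pool.submit(() -> sink.accept(newDoc(id))));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for completion and surface any task failure
            }
            return futures.size();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Doc does not override equals/hashCode, so the set counts distinct instances.
        Set<Doc> seen = ConcurrentHashMap.newKeySet();
        int completed = indexConcurrently(4, seen::add);
        System.out.println(completed + " tasks, " + seen.size() + " distinct documents");
    }
}
```

Building the document per task costs a few extra allocations, which is usually 
small compared with the indexing work itself.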


was (Author: zhanlijun):
I used multiple threads to add a single document (held in a static variable), 
which caused this error. After I corrected this, the error no longer occurred.

 PendingTerm cannot be cast to PendingBlock
 --

 Key: LUCENE-6037
 URL: https://issues.apache.org/jira/browse/LUCENE-6037
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/codecs
Affects Versions: 4.3.1
 Environment: ubuntu 64bit
Reporter: zhanlijun
Priority: Critical
 Fix For: 4.3.1


 The error is as follows:
 java.lang.ClassCastException: org.apache.lucene.codecs.BlockTreeTermsWriter$PendingTerm cannot be cast to org.apache.lucene.codecs.BlockTreeTermsWriter$PendingBlock
 at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.finish(BlockTreeTermsWriter.java:1014)
 at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:553)
 at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
 at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
 at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
 at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
 at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:493)
 at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:480)
 at org.apache.lucene.index.DocumentsWriter.postUpdate(DocumentsWriter.java:378)
 at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:413)
 at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1283)
 at org.apache.lucene.index.IndexWriter.addDocuments(IndexWriter.java:1243)
 at org.apache.lucene.index.IndexWriter.addDocuments(IndexWriter.java:1228)


