Re: is the classes ended with PerThread(*PerThread) multithread

2010-12-28 Thread Simon Willnauer
Hey there,

so what you are looking at are classes that are created per thread
rather than shared with other threads. Lucene internally rarely
creates threads, subclasses Thread, or implements Runnable or Callable
(ParallelMultiSearcher and some of the merging code are exceptions).
Yet, inside the indexer, when you add (update) a document Lucene
uses the caller's thread rather than spawning a new one. When you
look at DocumentsWriter.java there should be a method called
getThreadState. Each indexing thread, let's say in updateDocument, gets
its thread-private DocumentsWriterThreadState. This thread state holds
a DocConsumerPerThread obtained from the DocumentsWriter's DocConsumer
(see the indexing chain). DocConsumerPerThread in that case is some
kind of decorator that holds other DocConsumerPerThread instances like
TermsHashPerThread etc.

The general pattern is that for each DocConsumer you can get a
DocConsumerPerThread for your indexing thread, which then consumes the
document you are processing right now.
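
To make the pattern concrete, here is a minimal sketch in plain Java. It is
not Lucene's code: PerThreadSketch, Writer, Consumer, ConsumerPerThread and
ThreadState are hypothetical stand-ins for DocumentsWriter, DocConsumer,
DocConsumerPerThread and DocumentsWriterThreadState, and the map lookup is
only an assumption about how a getThreadState-style method could bind a
state to the calling thread. The real method does more than this.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PerThreadSketch {

    /** Shared component: one instance for the whole writer (stand-in for a DocConsumer). */
    static class Consumer {
        /** Hands out a fresh per-thread worker; nothing here starts a thread. */
        ConsumerPerThread addThread() {
            return new ConsumerPerThread();
        }
    }

    /** Per-thread worker: only ever touched by the thread that owns it. */
    static class ConsumerPerThread {
        int docsProcessed; // no locking needed, a single thread writes this

        void processDocument(String doc) {
            docsProcessed++;
        }
    }

    /** Thread-private state (stand-in for DocumentsWriterThreadState). */
    static class ThreadState {
        final ConsumerPerThread consumer;

        ThreadState(ConsumerPerThread consumer) {
            this.consumer = consumer;
        }
    }

    /** Hypothetical writer: it does its work on whichever thread calls it. */
    static class Writer {
        private final Consumer consumer = new Consumer();
        private final Map<Thread, ThreadState> states = new ConcurrentHashMap<>();

        /** Returns the state bound to the calling thread, creating it on first use. */
        ThreadState getThreadState() {
            return states.computeIfAbsent(Thread.currentThread(),
                    t -> new ThreadState(consumer.addThread()));
        }

        void updateDocument(String doc) {
            getThreadState().consumer.processDocument(doc);
        }

        int threadStateCount() {
            return states.size();
        }
    }

    /** Starts numThreads caller threads and returns how many thread states were created. */
    static int run(int numThreads, int docsPerThread) {
        Writer writer = new Writer();
        Thread[] threads = new Thread[numThreads];
        for (int i = 0; i < numThreads; i++) {
            // The threads belong to the application, not to the writer.
            threads[i] = new Thread(() -> {
                for (int d = 0; d < docsPerThread; d++) {
                    writer.updateDocument("doc " + d);
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return writer.threadStateCount();
    }

    public static void main(String[] args) {
        System.out.println("thread states: " + run(4, 10));
    }
}
```

None of the classes in the sketch extends Thread or implements Runnable; the
only threads are the ones the calling application starts, and each of them
ends up with its own state object.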

I hope that helps

simon



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: is the classes ended with PerThread(*PerThread) multithread

2010-12-28 Thread xu cheng
hi simon

thanks very much for replying.

after reading the source code with your suggestion, here's my understanding,
and I don't know whether it's right:

the DocumentsWriter doesn't actually create threads, but the code that
uses DocumentsWriter can be multithreaded (say, several threads call
updateDocument). each thread has its own DocumentsWriterThreadState, and
each DocumentsWriterThreadState has its own objects (the *PerThread ones
such as DocFieldProcessorPerThread, DocInverterPerThread and so on).

as the methods of DocumentsWriter are called by multiple threads, for
example 4 threads, there are 4 DocumentsWriterThreadState objects and 4
index chains (each index chain has its own *PerThread objects to
process the document).

am I right??

thanks for replying again!







Re: is the classes ended with PerThread(*PerThread) multithread

2010-12-28 Thread Simon Willnauer
On Tue, Dec 28, 2010 at 10:57 AM, xu cheng xcheng@gmail.com wrote:
 hi simon
 thanks very much for replying.
 after reading the source code with your suggestion, here's my understanding,
 and I don't know whether it's right:
 the DocumentsWriter doesn't actually create threads, but the code that uses
 DocumentsWriter can be multithreaded (say, several threads call
 updateDocument). each thread has its own DocumentsWriterThreadState, and
 each DocumentsWriterThreadState has its own objects (the *PerThread ones
 such as DocFieldProcessorPerThread, DocInverterPerThread and so on).
 as the methods of DocumentsWriter are called by multiple threads, for
 example 4 threads, there are 4 DocumentsWriterThreadState objects and 4
 index chains (each index chain has its own *PerThread objects to
 process the document).
 am I right??

that sounds about right

simon



Re: is the classes ended with PerThread(*PerThread) multithread

2010-12-28 Thread Earwin Burrfoot
There is a single indexing chain, with a single instance of each chain
component, except those ending in -PerThread.
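
A small sketch of that shape, in plain Java rather than Lucene's actual
classes: SingleChainSketch, Processor, Inverter and their *PerThread
counterparts are hypothetical names, and the addThread method is an
assumption made for illustration. The chain components exist once; only the
per-thread objects are created per indexing thread.

```java
public class SingleChainSketch {

    /** Second component of the shared chain; exists once. */
    static class Inverter {
        InverterPerThread addThread() {
            return new InverterPerThread(this);
        }
    }

    /** Per-thread counterpart of Inverter. */
    static class InverterPerThread {
        final Inverter shared; // back-reference to the single shared component

        InverterPerThread(Inverter shared) {
            this.shared = shared;
        }
    }

    /** First component of the shared chain; owns the next component. */
    static class Processor {
        final Inverter next = new Inverter();

        /** Builds the per-thread mirror of the whole chain for one indexing thread. */
        ProcessorPerThread addThread() {
            return new ProcessorPerThread(this, next.addThread());
        }
    }

    /** Per-thread counterpart of Processor; wraps the next per-thread object. */
    static class ProcessorPerThread {
        final Processor shared;
        final InverterPerThread next;

        ProcessorPerThread(Processor shared, InverterPerThread next) {
            this.shared = shared;
            this.next = next;
        }
    }

    public static void main(String[] args) {
        Processor chain = new Processor();            // the one and only chain
        ProcessorPerThread a = chain.addThread();     // for indexing thread A
        ProcessorPerThread b = chain.addThread();     // for indexing thread B

        System.out.println("same chain:         " + (a.shared == b.shared));
        System.out.println("same inverter:      " + (a.next.shared == b.next.shared));
        System.out.println("distinct perThread: " + (a != b && a.next != b.next));
    }
}
```

So four indexing threads would mean four sets of per-thread objects, but
still one chain.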

Though that's gonna change with
https://issues.apache.org/jira/browse/LUCENE-2324


-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Phone: +7 (495) 683-567-4
ICQ: 104465785




is the classes ended with PerThread(*PerThread) multithread

2010-12-27 Thread xu cheng
hi all:
I'm new to dev.
these days I'm reading the source code in the index package
and I'm confused.
there are classes with the suffix PerThread, such as DocFieldProcessorPerThread,
DocInverterPerThread, TermsHashPerThread and FreqProxTermsWriterPerThread.

on this mailing list I was told that they are multithreaded.
however, I have some difficulty understanding that!
I see no sign that they inherit from Thread, implement
Runnable, or anything like that.

how do they map to OS threads?

thanks ^_^