[jira] Commented: (LUCENE-622) Provide More of Lucene For Maven

2007-06-15 Thread Michael Busch (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505075
 ] 

Michael Busch commented on LUCENE-622:
--

Sami, thanks for your testing efforts!
I also ran some tests and all artifacts seem to be fine except lucene-bdb.
It has sleepycat je 1.7 as a dependency, but it actually needs
http://downloads.osafoundation.org/db/db-4.3.29.jar to compile. I couldn't
find that one in a Maven repository though.

 Provide More of Lucene For Maven
 

 Key: LUCENE-622
 URL: https://issues.apache.org/jira/browse/LUCENE-622
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 2.0.0
Reporter: Stephen Duncan Jr
Assignee: Michael Busch
 Fix For: 2.2

 Attachments: lucene-622.txt, LUCENE-622_NEW.patch, lucene-core.pom, 
 lucene-highlighter-2.0.0.pom, lucene-maven.patch, lucene-maven.tar.bz2, 
 test-project.tar.gz


 Please provide javadoc & source jars for lucene-core.  Also, please provide 
 the rest of lucene (the jars inside of contrib in the download bundle) if 
 possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Fwd: Call for Papers Opens for OS Summit Asia 2007

2007-06-15 Thread Erik Hatcher



Begin forwarded message:

From: J Aaron Farr [EMAIL PROTECTED]


Call for Papers Opens for OS Summit Asia 2007

The call for papers is now open for OS Summit Asia, to be held
November 26-30 at the Cyberport in Hong Kong.  This joint conference
between the Apache Software Foundation and the Eclipse Foundation will
consist of two days of tutorials (Nov 26-27) and three days of
regular conference sessions (Nov 28-30).

The paper submission deadline is Friday, 13 July, 2007, Midnight PDT.

You may log in to the ApacheCon submission site to submit your
proposals.  Further details about the conference, submissions, and
fees can be found at:

  http://www.ossummit.com/cfp.html

Topics appropriate for submission include, but are not restricted to,
the following:

 * ASF-wide projects such as Apache HTTP server, Tomcat, Struts,
   Geronimo, mod_perl and XML Web Services

 * Eclipse-wide projects such as BI and Reporting Tools (BIRT), Web
   Tools Platform (WTP), Eclipse Modeling Framework (EMF), Data Tools
   Platform (DTP), Equinox and the Rich Client Platform (RCP)

 * Programming languages such as Java, Perl, Python, Ruby and PHP

 * Web development technologies and techniques including security,
   performance tuning, e-commerce and J2EE

 * New technologies and trends such as Web Services and Web 2.0

 * Open source community and business models, legal and marketing
   issues

 * Open source projects and activities in Asia, local efforts and case
   studies

Thanks and we hope to hear from you, and see you in Hong Kong!

--
The OSSummit Planners
[EMAIL PROTECTED]






[jira] Commented: (LUCENE-622) Provide More of Lucene For Maven

2007-06-15 Thread Karl Wettin (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505286
 ] 

Karl Wettin commented on LUCENE-622:


Michael Busch - [15/Jun/07 12:41 AM ]
 It has sleepycat je 1.7 as a dependency, but actually it needs the
 http://downloads.osafoundation.org/db/db-4.3.29.jar to compile. Couldn't
 find that one in a maven repository though. 

I don't know what the Apache Foundation thinks about it, but hosting a Maven
repo is a piece of cake. Also, I would not mind at all if Hudson or something
would publish a nightly snapshot of the Lucene projects to that repo.






[jira] Commented: (LUCENE-622) Provide More of Lucene For Maven

2007-06-15 Thread Michael Busch (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505324
 ] 

Michael Busch commented on LUCENE-622:
--

 Also, I would not mind at all if the Hudson or something
 would publish a nightly snapshot of the Lucene projects to that repo.

Yes, we could think about doing that in the future.

For now I'm planning to commit this patch shortly for 2.2 unless there
are objections.




[jira] Updated: (LUCENE-843) improve how IndexWriter uses RAM to buffer added documents

2007-06-15 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-843:
--

Attachment: LUCENE-843.take8.patch

Attached latest patch.

I think this patch is ready to commit.  I will let it sit for a while
so people can review it.

We still need to do LUCENE-845 before it can be committed as is.

However one option instead would be to commit this patch, but leave
IndexWriter flushing by doc count by default and then later switch it
to flush by net RAM usage once LUCENE-845 is done.  I like this option
best.
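To make the two flush triggers concrete, here is a minimal sketch of a policy that can flush either by doc count or by net RAM usage. It is not the code from the patch: FlushPolicySketch, addDocument and both thresholds are hypothetical names chosen for illustration, and the per-document byte size is assumed to be supplied by the caller.

```java
/** Hypothetical sketch: decides when buffered documents should be flushed. */
final class FlushPolicySketch {
    private final int maxBufferedDocs;   // flush-by-count threshold; <= 0 disables it
    private final long ramBufferBytes;   // flush-by-RAM threshold; <= 0 disables it
    private int bufferedDocs;
    private long bytesUsed;

    FlushPolicySketch(int maxBufferedDocs, long ramBufferBytes) {
        this.maxBufferedDocs = maxBufferedDocs;
        this.ramBufferBytes = ramBufferBytes;
    }

    /**
     * Records one added document and returns true when the caller should flush.
     * The method is synchronized so that the counters are updated and checked
     * as one step, and only one thread is told to flush per threshold crossing.
     */
    synchronized boolean addDocument(long docBytes) {
        bufferedDocs++;
        bytesUsed += docBytes;
        boolean byCount = maxBufferedDocs > 0 && bufferedDocs >= maxBufferedDocs;
        boolean byRam = ramBufferBytes > 0 && bytesUsed >= ramBufferBytes;
        if (byCount || byRam) {
            bufferedDocs = 0;   // the caller flushes; start a fresh buffer
            bytesUsed = 0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Flush by doc count only (RAM trigger disabled).
        FlushPolicySketch policy = new FlushPolicySketch(3, 0);
        for (int i = 1; i <= 4; i++) {
            System.out.println("doc " + i + " flush=" + policy.addDocument(100));
        }
    }
}
```

Keeping the doc count trigger as the default amounts to constructing the policy with only the first threshold enabled; switching to RAM-based flushing later would then be a change of defaults, not of structure.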

All tests pass (I've re-enabled the disk full tests and fixed error
handling so they now pass) on Windows XP, Debian Linux and OS X.

Summary of the changes in this rev:

  * Finished cleaning up & commenting code

  * Exception handling: if there is a disk full or any other exception
while adding a document or flushing then the index is rolled back
to the last commit point.

  * Added more unit tests

  * Removed my profiling tool from the patch (not intended to be
committed)

  * Fixed a thread safety issue where, if you flush by doc count, the
flushed segment would sometimes contain more docs than the count you
requested.  I moved the thread synchronization for determining
flush time down into DocumentsWriter.

  * Also fixed thread safety of calling flush with one thread while
other threads are still adding documents.

  * The biggest change is: absorbed all merging logic back into
IndexWriter.

Previously in DocumentsWriter I was tracking my own
flushed/partial segments and merging them on my own (but using
SegmentMerger).  Moving that logic back into IndexWriter makes
DocumentsWriter much simpler: now its sole purpose is to gather
added docs and write a new segment.

This turns out to be a big win:

  - Code is much simpler (no duplication of merging
policy/logic)

  - 21-25% additional performance gain for autoCommit=false case
when stored fields & vectors are used

  - IndexWriter.close() no longer takes an unexpectedly long time to
close in autoCommit=false case

However I had to make a change to the index format to do this.
The basic idea is to allow multiple segments to share access to
the doc store (stored fields, vectors) index files.

The change is quite simple: FieldsReader/VectorsReader are now
told the doc offset that they should start from when seeking in
the index stream (this info is stored in SegmentInfo).  When
merging segments we don't merge the doc store files when all
segments are sharing the same ones (big performance gain), else,
we make a private copy of the doc store files (ie as segments
normally are on the trunk today).

The change is fully backwards compatible (I added a test case to
the backwards compatibility unit test to be sure) and the change
is only used when autoCommit=false.

When autoCommit=false, the writer will append stored fields /
vectors to a single set of files even though it is flushing normal
segments whenever RAM is full.  These normal segments all refer to
the single shared set of doc store files.  Then when segments
are merged, the newly merged segment has its own private doc
stores again.  So the sharing only occurs for the level 0
segments.

I still need to update fileformats doc with this change.
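The doc offset idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch itself: DocStore, SegmentView and docStoreOffset are hypothetical names, and an in-memory list stands in for the stored fields / vectors index files.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of several segments sharing one append-only doc store. */
final class SharedDocStoreSketch {

    /** Append-only store standing in for the shared stored fields / vectors files. */
    static final class DocStore {
        private final List<String> docs = new ArrayList<>();

        /** Appends one document and returns its position in the store. */
        int append(String storedFields) {
            docs.add(storedFields);
            return docs.size() - 1;
        }

        String get(int storeDocId) {
            return docs.get(storeDocId);
        }
    }

    /** A segment remembers where its own documents start in the shared store. */
    static final class SegmentView {
        private final DocStore store;
        private final int docStoreOffset; // first doc of this segment in the store
        private final int docCount;

        SegmentView(DocStore store, int docStoreOffset, int docCount) {
            this.store = store;
            this.docStoreOffset = docStoreOffset;
            this.docCount = docCount;
        }

        /** Maps a segment-local doc id onto the shared store by adding the offset. */
        String document(int localDocId) {
            if (localDocId < 0 || localDocId >= docCount) {
                throw new IndexOutOfBoundsException("doc " + localDocId);
            }
            return store.get(docStoreOffset + localDocId);
        }
    }

    public static void main(String[] args) {
        DocStore store = new DocStore();
        store.append("doc A");
        store.append("doc B");
        store.append("doc C");
        SegmentView seg0 = new SegmentView(store, 0, 2); // docs 0..1 of the store
        SegmentView seg1 = new SegmentView(store, 2, 1); // doc 2 of the store
        System.out.println(seg0.document(1));
        System.out.println(seg1.document(0));
    }
}
```

In this sketch a merged segment would get "its own private doc stores again" by copying its documents into a fresh DocStore and using an offset of 0.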


 improve how IndexWriter uses RAM to buffer added documents
 --

 Key: LUCENE-843
 URL: https://issues.apache.org/jira/browse/LUCENE-843
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 2.2
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Attachments: LUCENE-843.patch, LUCENE-843.take2.patch, 
 LUCENE-843.take3.patch, LUCENE-843.take4.patch, LUCENE-843.take5.patch, 
 LUCENE-843.take6.patch, LUCENE-843.take7.patch, LUCENE-843.take8.patch


 I'm working on a new class (MultiDocumentWriter) that writes more than
 one document directly into a single Lucene segment, more efficiently
 than the current approach.
 This only affects the creation of an initial segment from added
 documents.  I haven't changed anything after that, eg how segments are
 merged.
 The basic ideas are:
   * Write stored fields and term vectors directly to disk (don't
 use up RAM for these).
   * Gather posting lists & term infos in RAM, but periodically do
 in-RAM merges.  Once RAM is full, flush buffers to disk (and
 merge them later when it's time to make a real segment).
   * Recycle objects/buffers to reduce time/stress in GC.
   * Other various optimizations.
 Some of these changes are similar to how KinoSearch builds a segment.
 But, I haven't made any changes to Lucene's file format nor 

[jira] Reopened: (LUCENE-925) Analysis Package Level Javadocs

2007-06-15 Thread Doron Cohen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen reopened LUCENE-925:


 Assignee: Doron Cohen  (was: Grant Ingersoll)
Lucene Fields: [Patch Available]

Later than I'd hoped, here is an extended analysis package doc.
Reopening this issue because I can't attach a file without reopening it.


 Analysis Package Level Javadocs
 ---

 Key: LUCENE-925
 URL: https://issues.apache.org/jira/browse/LUCENE-925
 Project: Lucene - Java
  Issue Type: Wish
  Components: Javadocs
Affects Versions: 2.2
Reporter: Grant Ingersoll
Assignee: Doron Cohen
Priority: Minor
 Fix For: 2.2

 Attachments: LUCENE-925-GSI-v2.patch, LUCENE-925.patch, 
 LUCENE-925.patch, LUCENE-925.patch


 Analysis package level javadocs need improving.  An overview of what an 
 Analyzer does, and maybe some sample code showing how to write your own 
 Analyzer, Tokenizer and TokenFilter would be really helpful.  Bonus would be 
 some discussion on best practices for achieving performance during analysis. 




[jira] Updated: (LUCENE-925) Analysis Package Level Javadocs

2007-06-15 Thread Doron Cohen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen updated LUCENE-925:
---

Attachment: LUCENE-925.patch

The attached patch adds some code samples and these sections: 
- Invoking the Analyzer
- Indexing Analysis vs. Search Analysis
- - Field Section Boundaries
- - Token Position Increments

As today is the last day to commit javadoc issues I am hoping for some quick 
feedback...


 Analysis Package Level Javadocs
 ---

 Key: LUCENE-925
 URL: https://issues.apache.org/jira/browse/LUCENE-925
 Project: Lucene - Java
  Issue Type: Wish
  Components: Javadocs
Affects Versions: 2.2
Reporter: Grant Ingersoll
Assignee: Doron Cohen
Priority: Minor
 Fix For: 2.2

 Attachments: LUCENE-925-GSI-v2.patch, LUCENE-925.patch, 
 LUCENE-925.patch, LUCENE-925.patch, LUCENE-925.patch


 Analysis package level javadocs need improving.  An overview of what an 
 Analyzer does, and maybe some sample code showing how to write your own 
 Analyzer, Tokenizer and TokenFilter would be really helpful.  Bonus would be 
 some discussion on best practices for achieving performance during analysis. 




[jira] Commented: (LUCENE-925) Analysis Package Level Javadocs

2007-06-15 Thread Michael Busch (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505367
 ] 

Michael Busch commented on LUCENE-925:
--

Doron,

I read the new sections... really good stuff and definitely useful!
I also like that you added more Wikipedia links.

My +1 for committing.




[jira] Commented: (LUCENE-843) improve how IndexWriter uses RAM to buffer added documents

2007-06-15 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505373
 ] 

Yonik Seeley commented on LUCENE-843:
-

 When merging segments we don't merge the doc store files when all segments 
 are sharing the same ones (big performance gain), 

Is this only in the case where the segments have no deleted docs?


 improve how IndexWriter uses RAM to buffer added documents
 --

 Key: LUCENE-843
 URL: https://issues.apache.org/jira/browse/LUCENE-843
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 2.2
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Attachments: LUCENE-843.patch, LUCENE-843.take2.patch, 
 LUCENE-843.take3.patch, LUCENE-843.take4.patch, LUCENE-843.take5.patch, 
 LUCENE-843.take6.patch, LUCENE-843.take7.patch, LUCENE-843.take8.patch


 I'm working on a new class (MultiDocumentWriter) that writes more than
 one document directly into a single Lucene segment, more efficiently
 than the current approach.
 This only affects the creation of an initial segment from added
 documents.  I haven't changed anything after that, eg how segments are
 merged.
 The basic ideas are:
   * Write stored fields and term vectors directly to disk (don't
 use up RAM for these).
   * Gather posting lists & term infos in RAM, but periodically do
 in-RAM merges.  Once RAM is full, flush buffers to disk (and
 merge them later when it's time to make a real segment).
   * Recycle objects/buffers to reduce time/stress in GC.
   * Other various optimizations.
 Some of these changes are similar to how KinoSearch builds a segment.
 But, I haven't made any changes to Lucene's file format nor added
 requirements for a global fields schema.
 So far the only externally visible change is a new method
 setRAMBufferSize in IndexWriter (and setMaxBufferedDocs is
 deprecated) so that it flushes according to RAM usage and not a fixed
 number of documents added.




[EMAIL PROTECTED]: Project lucene-java (in module lucene-java) failed

2007-06-15 Thread Jason van Zyl
To whom it may engage...

This is an automated request, but not an unsolicited one. For 
more information please visit http://gump.apache.org/nagged.html, 
and/or contact the folk at [EMAIL PROTECTED]

Project lucene-java has an issue affecting its community integration.
This issue affects 3 projects,
 and has been outstanding for 31 runs.
The current state of this project is 'Failed', with reason 'Build Failed'.
For reference only, the following projects are affected by this:
- eyebrowse :  Web-based mail archive browsing
- jakarta-lucene :  Java Based Search Engine
- lucene-java :  Java Based Search Engine


Full details are available at:
http://vmgump.apache.org/gump/public/lucene-java/lucene-java/index.html

That said, some information snippets are provided here.

The following annotations (debug/informational/warning/error messages) were 
provided:
 -DEBUG- Sole output [lucene-core-15062007.jar] identifier set to project name
 -DEBUG- Dependency on javacc exists, no need to add for property javacc.home.
 -INFO- Failed with reason build failed
 -INFO- Failed to extract fallback artifacts from Gump Repository



The following work was performed:
http://vmgump.apache.org/gump/public/lucene-java/lucene-java/gump_work/build_lucene-java_lucene-java.html
Work Name: build_lucene-java_lucene-java (Type: Build)
Work ended in a state of : Failed
Elapsed: 59 secs
Command Line: /opt/jdk1.5/bin/java -Djava.awt.headless=true 
-Xbootclasspath/p:/usr/local/gump/public/workspace/xml-commons/java/external/build/xml-apis.jar:/usr/local/gump/public/workspace/xml-xerces2/build/xercesImpl.jar
 org.apache.tools.ant.Main -Dgump.merge=/x1/gump/public/gump/work/merge.xml 
-Dbuild.sysclasspath=only -Dversion=15062007 
-Djavacc.home=/usr/local/gump/packages/javacc-3.1 package 
[Working Directory: /usr/local/gump/public/workspace/lucene-java]
CLASSPATH: 


[jira] Resolved: (LUCENE-925) Analysis Package Level Javadocs

2007-06-15 Thread Doron Cohen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen resolved LUCENE-925.


Resolution: Fixed

Committed to branch 2.2 and trunk (btw, removed a few differences between trunk
and branch 2.2 for this file).




[jira] Commented: (LUCENE-843) improve how IndexWriter uses RAM to buffer added documents

2007-06-15 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505418
 ] 

Michael McCandless commented on LUCENE-843:
---

  When merging segments we don't merge the doc store files when all 
  segments are sharing the same ones (big performance gain),
 
 Is this only in the case where the segments have no deleted docs? 

Right.  Also the segments must be contiguous which the current merge
policy ensures but future merge policies may not.
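The conditions discussed here (shared doc store, no deleted docs, contiguous segments) could be checked along these lines. This is a hedged sketch only: Segment, its fields and canSkipDocStoreMerge are hypothetical names for illustration and are not taken from the patch.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: may a merge leave the shared doc store files untouched? */
final class DocStoreMergeCheckSketch {

    /** Minimal stand-in for the per-segment info a merge would consult. */
    static final class Segment {
        final String docStoreName;  // which doc store files this segment uses
        final int docStoreOffset;   // first doc of this segment within that store
        final int docCount;
        final boolean hasDeletions;

        Segment(String docStoreName, int docStoreOffset, int docCount, boolean hasDeletions) {
            this.docStoreName = docStoreName;
            this.docStoreOffset = docStoreOffset;
            this.docCount = docCount;
            this.hasDeletions = hasDeletions;
        }
    }

    /**
     * Returns true only when every segment uses the same doc store, none has
     * deleted docs, and each segment starts exactly where the previous one ended.
     */
    static boolean canSkipDocStoreMerge(List<Segment> segments) {
        if (segments.isEmpty()) {
            return false;
        }
        String store = segments.get(0).docStoreName;
        int expectedOffset = segments.get(0).docStoreOffset;
        for (Segment s : segments) {
            if (s.hasDeletions) return false;                     // deleted docs force a rewrite
            if (!s.docStoreName.equals(store)) return false;      // different doc store files
            if (s.docStoreOffset != expectedOffset) return false; // gap, so not contiguous
            expectedOffset += s.docCount;
        }
        return true;
    }

    public static void main(String[] args) {
        Segment a = new Segment("_0", 0, 10, false);
        Segment b = new Segment("_0", 10, 5, false);
        System.out.println(canSkipDocStoreMerge(Arrays.asList(a, b)));
    }
}
```

When the check fails, the merge would fall back to writing a private copy of the doc store for the merged segment.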





Re: [EMAIL PROTECTED]: Project lucene-java (in module lucene-java) failed

2007-06-15 Thread Chris Hostetter


: I just got a clean copy of the lucene trunk and I can build the
: project without any problems. I first guessed that the 3rd party lib
: could not be downloaded during the build process but according to
: http://vmgump.apache.org/gump/public/lucene-java/lucene-java/gump_work/build_lucene-java_lucene-java.html
: this seems to be rather a classpath problem?!
:
: Does this happen every day?!

it's sporadic, but the frequency definitely seems to have increased a bit
lately.  I recently tried to email the gump folks about a related problem
with contrib/bdb and got no response (but searching for it in the archives
now i see that someone did respond, just not to me or the lucene list)...

http://www.nabble.com/-GUMP%40vmgump-%3A-Project-lucene-java-%28in-module-lucene-java%29-failed-tf3860340.html#a10941228
http://www.nabble.com/Re%3A--GUMP%40vmgump-%3A-Project-lucene-java-%28in-module-lucene-java%29-failed-tf3862915.html#a10943423

...it's not clear to me how gump works, or how it builds things, but it
seems evident from that reply that it doesn't really care what
your build.xml says about how to compile the code, it has its own way of
dealing with dependencies.

If anyone on this list cares about gump, perhaps they would be interested
in subscribing to the gump mailing list and working with the gump
community to make gump play nicer with our build system -- by all means
feel free.



-Hoss





[jira] Resolved: (LUCENE-622) Provide More of Lucene For Maven

2007-06-15 Thread Michael Busch (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Busch resolved LUCENE-622.
--

Resolution: Fixed

Committed to trunk & 2.2 branch with a little change:
I removed the sleepycat je 1.7 dependency from the 
pom.xml of lucene-bdb.

Phew!




Re: PLEASE READ: 2.2 branch created, feature freeze in effect

2007-06-15 Thread Michael Busch

Michael Busch wrote:

- *Only* Jira issues with Fix version 2.2 and priority Blocker will
 delay a release candidate build. If on June 17th none of those issues are
 in Jira I will build a release candidate and call a release vote on
 java-dev.

Hi Team,

our schedule looks good! There are only 4 issues left in Jira, all of
them javadoc improvements. Thanks for all the great documentation
patches!
Tomorrow (Saturday) is the last day before the release candidate build.
So if anyone wants to commit more documentation patches please do so
by end of tomorrow.

- Michael
