[
https://issues.apache.org/jira/browse/LUCENE-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779641#action_12779641
]
Paul Smith commented on LUCENE-2075:
bq. This cache impl should be able to suppor
[
https://issues.apache.org/jira/browse/LUCENE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761408#action_12761408
]
Paul Smith commented on LUCENE-1935:
thanks Uwe, I thought I would regret as
[
https://issues.apache.org/jira/browse/LUCENE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761395#action_12761395
]
Paul Smith commented on LUCENE-1935:
I shall perhaps regret asking this, but is t
[
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12735242#action_12735242
]
Paul Smith commented on LUCENE-1749:
You know what would be absolute icing on
[
https://issues.apache.org/jira/browse/LUCENE-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730551#action_12730551
]
Paul Smith commented on LUCENE-1741:
An algorithm is nice if there are no spec
[
https://issues.apache.org/jira/browse/LUCENE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650051#action_12650051
]
Paul Smith commented on LUCENE-1342:
yeah, it's definitely a Sun bug, not
[
https://issues.apache.org/jira/browse/LUCENE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Smith updated LUCENE-1342:
---
Attachment: hs_err_pid27882.log
hs_err_pid21301.log
2 crash dumps attached
[
https://issues.apache.org/jira/browse/LUCENE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648768#action_12648768
]
Paul Smith commented on LUCENE-1342:
java version "1.6.0_10"
Java(
[
https://issues.apache.org/jira/browse/LUCENE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628513#action_12628513
]
Paul Smith commented on LUCENE-1372:
bq. I'm not following this argument. W
[
https://issues.apache.org/jira/browse/LUCENE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628480#action_12628480
]
Paul Smith commented on LUCENE-1372:
Having a Document sorted last because it
[
https://issues.apache.org/jira/browse/LUCENE-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613208#action_12613208
]
Paul Smith commented on LUCENE-1282:
Can anyone comment as to whether the JRE 1.
[
https://issues.apache.org/jira/browse/LUCENE-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596964#action_12596964
]
Paul Smith commented on LUCENE-1282:
Throwing up an idea here for consideration.
[
https://issues.apache.org/jira/browse/LUCENE-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595946#action_12595946
]
Paul Smith commented on LUCENE-1282:
Another workaround might be to use
ppeared to build and test ok. I'm
happy to pitch in here.
cheers,
Paul Smith
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
[
https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515882
]
Paul Smith commented on LUCENE-966:
---
We did pretty much the same thing here at Aconex. The tokenization
On 19/06/2007, at 9:58 AM, Michael Busch wrote:
Paul Smith wrote:
Any chance of adding source jars as artifacts too? Makes the
Maven Eclipse plugin rather nice. I appreciate the effort in
organizing the artifacts (particularly the older versions).
cheers,
Paul
In German we have
*sigh*, with attachment this time:
lucene_pom.2.patch
Description: Binary data
On 19/06/2007, at 11:42 AM, Paul Smith wrote:
Enhanced version of previous patch. Now compiles and executes all unit tests (although some of them are failing for me)
mvn -f lucene-core.pom.xml test
you can still do a
Enhanced version of previous patch. Now compiles and executes all
unit tests (although some of them are failing for me)
mvn -f lucene-core.pom.xml test
you can still do a package (including source distro) and skip the tests
mvn -f lucene-core.pom.xml -Dmaven.test.skip=true package
assem
lucene_pom.patch
Description: Binary data
Attached is a quick patch for the lucene-core pom so that it does compile and package successfully:
mvn -f lucene-core.pom.xml package
Ends up with a binary jar in the target/ sub-folder
mvn assembly:assembly
Creates a source distribution in the target folde
h/.m2/repository/org/apache/lucene/lucene-parent/@version@/
[EMAIL PROTECTED]@.pom
Am I missing something ?
Paul
On 19/06/2007, at 10:15 AM, Michael Busch wrote:
Paul Smith wrote:
I might try and grab the trunk and see if I can work out what's
needed to do that..
Paul
That'
I'm just kidding, of course! I'll try to take a look at that.
However, making these artifacts was already a lot of work and I'm
not sure how soon I can work on the source artifacts.
I might try and grab the trunk and see if I can work out what's
needed to do that..
Paul
--
On 19/06/2007, at 6:14 AM, Michael Busch wrote:
Hello,
looking at JIRA and the email archives I find several people asking
us to upload Lucene to the Maven2 repository. Currently there are
only the artifacts from Lucene core 1.9.1 and 2.0.0 in the
repository. 1.9.1 is even incomplete, as
On 12/06/2007, at 7:07 PM, mark harwood wrote:
Thanks for the pointers Paul.
I just don't think you can 'package' up a distribution that
includes these jars in your distribution.
Clearly the binary distribution need not bundle servlet-api.jar - a
demo.war file is all that is needed.
Howe
On 12/06/2007, at 5:09 PM, markharw00d wrote:
As part of the documentation push I was considering putting
together an updated demo web app which showed a number of things
(indexing, search, highlighting, XML Query templates etc) and was
wondering what that might mean to the build system if
To answer your question, though, I don't see any reason not to make
the changes to make the current process more repeatable.
Yeah, mod'ing the ant process now is going to be simpler to catch the
current problem. Still, I'd check the Gump stuff for Lucene, because
I'd be surprised that wo
want to jump at without careful thought, but might be worth
considering. I used to be anti-maven, but since version 2, and since
Curt Arnold has been setting up the log4j build environment for
maven, I've been quite impressed with its capability.
cheers,
Paul Smith
On 17/0
A memory saving optimization would be to not load the corresponding
String[] in the string index (as discussed previously), but there is
currently no way to tell the FieldCache that the strings are unneeded.
The String values are only needed for merging results in a
MultiSearcher.
Yep, which hap
In our application, we have to sync up the index pretty frequently,
the
warm-up of the index is killing it.
Yep, it speeds up the first sort, but at the cost of making all the
others slower (maybe significantly so). That's obviously not ideal
but could make use of sorts in larger index
Now, if we could use integers to represent the sort field values,
which is
typically the case for most applications, maybe we can afford to
have the
sort field values stored in the disk and do disk lookup for each
document
matched? The look up of the sort field value will be as simple as
On 10/04/2007, at 4:18 AM, Doug Cutting wrote:
Paul Smith wrote:
Disadvantages to this approach:
* It's a lot more I/O intensive
I think this would be prohibitive. Queries matching more than a
few hundred documents will take several seconds to sort, since
random disk accesse
this is
controversial. But, if we wish Lucene to go beyond where it is now,
I think we need to start thinking about this particular problem
sooner rather than later.
Happy Easter to all,
Paul Smith
[
https://issues.apache.org/jira/browse/LUCENE-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12481324
]
Paul Smith commented on LUCENE-833:
---
You should try Fisheye! It uses Lucene internally.
http://www.cenqua.com
read with
interest! :)
cheers,
Paul Smith
On 05/10/2006, at 3:34 PM, Doron Cohen wrote:
If I read the JIRA issue right, it looks as if this is fixed in Lucene 2.0.1. Is it?
If so, where can I download 2.0.1?
No 2.0.1 was released (yet).
This issue is fixed in the "svn head".
Nightly builds that include this (and oth
end of the stream for tokenization point of view.
I would love to get rid of it, but I think it will break a lot of
behaviour.
cheers,
Paul Smith
On 04/10/2006, at 11:48 AM, George Aroush wrote:
Hi folks,
Over at Lucene.Net, we are trying to determine if it's safe to do the
foll
[
http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436443 ]
Paul Smith commented on LUCENE-675:
---
From a strict performance point of view, a standard set of important, but
don't forget other language
[
http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436437 ]
Paul Smith commented on LUCENE-675:
---
If you're looking for freely available text in bulk, what about:
http://www.gutenberg.org/wiki/Main_Page
>
,
Paul Smith
[
http://issues.apache.org/jira/browse/LUCENE-388?page=comments#action_12427975 ]
Paul Smith commented on LUCENE-388:
---
This is where some tracing logging code would be useful. Maybe a YourKit
memory snapshot to see what's going on..
[
http://issues.apache.org/jira/browse/LUCENE-388?page=comments#action_12427818 ]
Paul Smith commented on LUCENE-388:
---
geez, yep definitely don't put this in, my patch was only a 'suggestion' to
highlight how it fixes the ro
nst it.
Before you make any decision, I'd sit down and plan what events
you'll actually want to log and at what level. Good planning will
make the Lucene library very useful. You can then decide how you're
going to log them.
cheers,
Paul Smith
No, I'm pretty sure it wouldn't, so long as you don't look at this
code, lest you become "tainted" ... ;-)
Isn't that where the phrase "I have no recollection of that Senator"
comes in handy? :)
Paul
it's of no use except for academic
study. Pity.
Would that preclude re-implementing the same algorithm in new source
code? I'm not clear on whether that violates the license.
cheers,
Paul Smith
looks a bit b0rk3n to me as well.
Maybe some text being displayed isn't being escaped properly causing
HTML mayhem?
Paul Smith
On 27/01/2006, at 8:12 AM, Yonik Seeley wrote:
I've been getting bad HTML out of JIRA lately:
http://issues.apache.org/jira/browse/LUCENE
Anyone els
On 03/01/2006, at 11:08 AM, markharw00d wrote:
I thought
you said you "didn't really want to have to design a general API for
parsing XML as part of this project" ? :)
Having grown tired of messing with my own solution I tried using
commons Digester with my example XML but ran into iss
Hey all,
I haven't been paying real close attention to this thread, but if any
of you are looking for something that has _easy_ Object->XML->Object
you should seriously try XStream (http://xstream.codehaus.org)..
Simplest/easiest api I've seen. BSD licensed too (Apache friendly).
One c
Most of the CPU time is actually used during the synchronization with multiple threads. I hacked together a version of MemoryLRUCache that used a ConcurrentHashMap from JDK 1.5, and it was another 50% faster ! At a minimum, if the ReadWriteLock class was modified to use the 1.5 facilities some si
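As a rough illustration of the idea in the message above (the hacked MemoryLRUCache itself is not shown in this snippet), here is a minimal sketch of a bounded cache backed by ConcurrentHashMap. The class name and the eviction rule are assumptions made for illustration: it drops arbitrary entries when over capacity, so it is not a true LRU, and under concurrent writers the size may briefly exceed the limit.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; not the MemoryLRUCache discussed in the thread. */
final class ConcurrentBoundedCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final int maxSize;

    ConcurrentBoundedCache(int maxSize) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be >= 1");
        }
        this.maxSize = maxSize;
    }

    /** Reads go straight to the map; no cache-wide lock is taken. */
    V get(K key) {
        return map.get(key);
    }

    /** Stores the entry, then trims arbitrary other entries if over capacity. */
    void put(K key, V value) {
        map.put(key, value);
        if (map.size() > maxSize) {
            // ConcurrentHashMap iterators are weakly consistent, so removing
            // entries while iterating does not throw.
            for (K candidate : map.keySet()) {
                if (map.size() <= maxSize) {
                    break;
                }
                if (!candidate.equals(key)) {
                    map.remove(candidate);
                }
            }
        }
    }

    int size() {
        return map.size();
    }
}
```

The trade-off is that readers never contend on a single lock, at the cost of giving up strict least-recently-used ordering.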
[
http://issues.apache.org/jira/browse/LUCENE-467?page=comments#action_12357925 ]
Paul Smith commented on LUCENE-467:
---
If you can create a patch against 1.4.3 there is a reasonable possibility that
I could create a 1.4.3 Lucene+ThisPatch jar and re-index
[
http://issues.apache.org/jira/browse/LUCENE-467?page=comments#action_12357839 ]
Paul Smith commented on LUCENE-467:
---
I probably didn't make my testing framework as clear as I should. Yourkit was
setup to use method sampling (waking up ev
On 17/11/2005, at 10:21 AM, Chris Lamprecht wrote:
1. Run profiler
2. Sort methods by CPU time spent
3. Optimize
4. Repeat
:)
Umm, well I know I could make it quicker, it's just whether it still
_works_ as expected. Maintaining the contract means I'll need to
develop some good junit
On 17/11/2005, at 9:24 AM, Doug Cutting wrote:
In general I would not take this sort of profiler output too
literally. If floatToRawIntBits is 5x faster, then you'd expect a
16% improvement from using it, but my guess is you'll see far
less. Still, it's probably worth switching & measuri
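The arithmetic behind the quoted estimate can be checked with a small sketch. It assumes the profiled method accounts for roughly 20% of indexing time, which is the share implied by pairing a 5x speedup with a 16% improvement. The class and method names are hypothetical and not part of Lucene.

```java
/** Hypothetical helper illustrating the estimate; not part of Lucene. */
final class SpeedupEstimate {
    /**
     * Fraction of total run time saved when a portion that takes
     * `fraction` of the run becomes `speedup` times faster.
     */
    static double timeSaved(double fraction, double speedup) {
        return fraction * (1.0 - 1.0 / speedup);
    }

    public static void main(String[] args) {
        // Assumed inputs: ~20% of indexing time in the method, made 5x faster.
        double saved = timeSaved(0.20, 5.0);
        System.out.printf("Expected reduction in run time: %.0f%%%n", saved * 100);
    }
}
```

This is only the upper bound the profile suggests. As the message says, the measured gain is likely to be smaller.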
I can confirm this takes ~ 20% of an overall Indexing operation (see
attached link from YourKit).
http://people.apache.org/~psmith/luceneYourkit.jpg
Mind you, the whole "signalling via IOException" in the
FastCharStream is a way bigger overhead, although I agree much harder
to f
On 01/10/2005, at 6:30 AM, Erik Hatcher wrote:
On Sep 30, 2005, at 1:26 AM, Paul Smith wrote:
This requirement is almost exactly the same as my requirement for
the log4j project I work on where I wanted to be able to index
every row in a text log file to be its own Document.
It
e fly
XPath like queries using Lucene which apparently works very well, but
I'm not sure it scales to massive documents such as log files (and
your requirements).
cheers,
Paul Smith
On 30/09/2005, at 3:17 PM, <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> wrote:
Hi,
My na
On 05/08/2005, at 4:10 AM, Doug Cutting wrote:
Doug Cutting wrote:
Perhaps we need to factor Nutch into two projects, one with NDFS
and MapReduce and the other with the search-specific code. This
falls almost exactly on package lines. The packages
org.apache.nutch.{io,ipc,fs,ndfs,mapre
from
nutch a shared library?
I would love to hear anyone's thoughts on the matter.
cheers,
Paul Smith
[1] http://wiki.apache.org/nutch-data/attachments/Presentations/
attachments/oscon05.pdf
[2] http://labs.google.com/papers/mapre
On 16/05/2005, at 5:00 PM, Paul Elschot wrote:
On Monday 16 May 2005 08:24, Paul Smith wrote:
something very odd is going on with my attachments... sorry for the
spam.
It's usually easier to open a bug in bugzilla and post the code and
the concerns there. The only disadvantage of bugzilla is
something very odd is going on with my attachments... sorry for the
spam.
On 16/05/2005, at 4:22 PM, Paul Smith wrote:
I'm not even going to say anything this time :-$
On 16/05/2005, at 4:17 PM, Paul Smith wrote:
Silly me, here's the patch with the extra code NOT commented ou
I'm not even going to say anything this time :-$
On 16/05/2005, at 4:17 PM, Paul Smith wrote:
Silly me, here's the patch with the extra code NOT commented out...
Oh my, how embarrassing... :)
Paul
On 16/05/2005, at 4:15 PM, Paul S
Silly me, here's the patch with the extra code NOT commented out...
Oh my, how embarrassing... :)
Paul
On 16/05/2005, at 4:15 PM, Paul Smith wrote:
confuse more people than it helps.
I would really appreciate anyone's thoughts on this, I'll be very happy to be proven wrong because it will just help me understand more of Lucene. I would hope that speeding up indexing would benefit everyone? Particularly the large scale sites out there.
cheers,
Paul Smith
IndexWriter.patch
Description: Binary data
e comment on the CPU profile I sent in?
If there was a way of optimizing that loop, then it could mean a
reasonable improvement in indexing speed.
cheers,
Paul Smith
earching inside the content index in this case.
Should it go into the core or in contrib?
+1 to core... (non-binding of course).
Paul Smith