RE: The Old Git Discussion

2014-01-03 Thread karl.wright
It also doesn't deal with a major difference between git and svn - in svn, directories are first-class objects, and in git they aren't (they are created as needed). So when you try using git-svn you almost always wind up with directories you want to remove but can't. Karl From: ext Michael Del

RE: The Old Git Discussion

2014-01-03 Thread karl.wright
As an interested party, and deeply involved in another related Apache project, I have to say that there is a huge benefit for all Apache projects to use common source control. If we were starting over, or if svn was going to die forever, it might be a different story - but given that svn is ali

Solr best practices for search components vis-a-vis sharding

2013-11-20 Thread karl.wright
Hi folks, Maybe this is documented somewhere, and someone can point me at it. For the ManifoldCF Solr plugins, we supply a SearchComponent, which wraps the supplied query in order to perform authorization restrictions on returned documents. The component only fires if the SHARDS parameter is

RE: [VOTE] Lucene / Solr 4.6.0"

2013-11-14 Thread karl.wright
Congratulations, Uwe! Karl Sent from my Windows Phone From: ext Koji Sekiguchi Sent: 11/14/2013 6:35 PM To: dev@lucene.apache.org Subject: Re: [VOTE] Lucene / Solr 4.6.0" Congrats Uwe! :) koji (13/11/15 5:11), Uwe Schindler wrote: > The PMC Chair is going to mar

RE: FW: Is there a really performant way to store a full 32-bit int in doc values?

2013-10-08 Thread karl.wright
. That is both the x & y into the same byte[] chunk. I've done this for a Solr integration in https://issues.apache.org/jira/browse/SOLR-5170 ~ David karl.wright-2 wrote > Hi All (and especially Robert), > > Lucene NumericDocValues seems to operate slower than we would ex

FW: Is there a really performant way to store a full 32-bit int in doc values?

2013-10-08 Thread karl.wright
Hi All (and especially Robert), Lucene NumericDocValues seems to operate slower than we would expect. In our application, we're using it for storing coordinate values, which we retrieve to compute a distance. While doing timings trying to determine the impact of including a sqrt in the calcul

RE: Lucene tests killed one other SSD - Policeman Jenkins

2013-08-19 Thread karl.wright
Right, that's what I said. And one write means writing the *whole* disk. So Mike and I may *both* be right. ;-) Karl -Original Message- From: ext Uwe Schindler [mailto:u...@thetaphi.de] Sent: Monday, August 19, 2013 1:07 PM To: dev@lucene.apache.org Subject: RE: Lucene tests killed on

RE: Lucene tests killed one other SSD - Policeman Jenkins

2013-08-19 Thread karl.wright
Mike, I'm talking about a 1TB SSD option for some hardware we are buying. If you are really curious, I can ask the people who are doing the project for the model and specs. Karl -Original Message- From: ext Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Monday, August 19,

RE: Lucene tests killed one other SSD - Policeman Jenkins

2013-08-19 Thread karl.wright
" Only 70 full writes seems a little bit low for an SSD." That's what I thought. I was astounded to learn that that is in fact correct (at least for some of the drives we are using here). Automatic recovery is how the SSD copes with this failure rate. But it is entirely possible that the caus

RE: Lucene tests killed one other SSD - Policeman Jenkins

2013-08-19 Thread karl.wright
I am told that SSDs are spec'd for only 70 full writes before they get an error. The error block is set aside but eventually something critical gets hit. So you should probably expect this to happen again. Karl -Original Message- From: ext Uwe Schindler [mailto:u...@thetaphi.d

RE: Solrj/Tika question about content types

2013-02-13 Thread karl.wright
Wow, Hoss, this post was so long ago I barely remember writing it. ;-) The problem we were having is not that the content type is not set in SolrJ - it's that SolrCell does not discover it as it did when we used multipart posts and ran with Solr 3.6. We still aren't sure where the change is tha

RE: Solrj/Tika question about content types

2013-01-17 Thread karl.wright
A quick update - it appears that cURL is providing a Content-Type header in the content part of its multipart post, and is using the file extension to come up with "text/plain". Changing the file name causes cURL to change this content-type to "application/octet-stream". But the questions stil

Solrj/Tika question about content types

2013-01-17 Thread karl.wright
Hi all, I'm researching the ticket CONNECTORS-513. In this ticket we seem to have different behavior between Solr 3.x and Solr 4.x as far as Tika content extraction is concerned. The differences seem to be related to the content type that is posted to Solr, and can be demonstrated with cURL.

RE: Is there documentation anywhere describing interoperability of SolrJ?

2012-12-28 Thread karl.wright
Thanks for the reply. The ticket in question is CONNECTORS-594, if you would like to just comment there. Karl Sent from my Windows Phone From: ext Ryan McKinley Sent: 12/28/2012 4:03 PM To: solr-...@lucene.apache.org Subject: Re: Is there documentation anywhere

Is there documentation anywhere describing interoperability of SolrJ?

2012-12-28 Thread karl.wright
Hi all, For the ManifoldCF project, we have an output connector for Solr, and we'd like to port it to use SolrJ instead of homegrown code. However, I cannot find any mention anywhere of whether anyone has tried to maintain compatibility between later versions of SolrJ (e.g. 4.0.0) and previous

RE: Spatial4j dependency in lucene 4.0.0, final

2012-11-15 Thread karl.wright
Hi David, We found the version in the grandparent pom, so that's ok. The build issue against 0.2 was due to other changes in Lucene 4.0.0-BETA vs. Lucene 4.0.0. I am willing to assist to some extent with spatial4j, if that is yours. It changed significantly from 0.2 to 0.3, and not just in th

Spatial4j dependency in lucene 4.0.0, final

2012-11-15 Thread karl.wright
Hi guys, The 4.0.0 lucene-spatial maven dependency on spatial4j is UNVERSIONED. But the two spatial4j versions in play (0.2 and 0.3) are significantly different. We have code developed for lucene-spatial 4.0.0 beta which doesn't seem to compile with either spatial4j version. What was the int

RE: Lucene 4.0 memory usage during indexing - is this expected?

2012-10-03 Thread karl.wright
Mystery resolved; the problem was due to an ever-increasing record size, which was in turn due to a record structure that was never being cleared. This caused it to appear as if the total allocation of structures used for analysis was steadily growing. But the number of such entities did NOT g
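The failure mode described above (a reused record structure that is never cleared, so each record grows with everything that came before it) can be illustrated with a minimal sketch. This is not the code from the thread; the class and method names are hypothetical, and a StringBuilder stands in for whatever record structure was actually involved.

```java
// Hypothetical illustration only: a reused buffer standing in for the
// "record structure that was never being cleared" mentioned in the post.
final class RecordReuseSketch {
    private final StringBuilder record = new StringBuilder();

    // Buggy variant: the buffer is reused but never reset, so every call
    // returns a record that also contains all previous documents.
    int buildWithoutClear(String doc) {
        record.append(doc);
        return record.length();
    }

    // Fixed variant: reset the buffer before building each record.
    int buildWithClear(String doc) {
        record.setLength(0);
        record.append(doc);
        return record.length();
    }

    public static void main(String[] args) {
        RecordReuseSketch sketch = new RecordReuseSketch();
        System.out.println(sketch.buildWithoutClear("abc")); // record holds 3 chars
        System.out.println(sketch.buildWithoutClear("abc")); // record has grown to 6 chars
        System.out.println(sketch.buildWithClear("abc"));    // back to 3 chars after the reset
    }
}
```

In a heap dump this pattern looks like steadily growing allocation even though the number of objects stays constant, which matches the symptom reported.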

RE: Lucene 4.0 memory usage during indexing - is this expected?

2012-10-03 Thread karl.wright
Threads are managed via an executor service and are a fixed size thread pool, of size 16 on this machine. There are not a lot of fields in the schema (a half dozen). We do use PerFieldAnalyzerWrapper. I'm still grappling with the mat reports; it's possible of course that we're holding onto so

RE: Lucene 4.0 memory usage during indexing - is this expected?

2012-10-03 Thread karl.wright
There's a fixed-sized thread pool involved in doing the indexing, of a size that depends on the machine parameters. Karl -Original Message- From: ext Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Wednesday, October 03, 2012 10:43 AM To: Wright Karl (Nokia-LC/Boston) Subject

RE: Solr posting question

2012-07-14 Thread karl.wright
I'm sorry the info has been dribbling in slowly; it's all now summarized in CONNECTORS-491. Now that I've confirmed that this even occurs for them without the ";" (unlike what I was originally told) it is clear it is a config related issue. I have urged them to look to this list for further he

RE: Solr posting question

2012-07-13 Thread karl.wright
Hoss, Here are the details: (1) The actual metadata posted is a string of the form "12345;#string". There is only one value posted for the metadata field, but Solr complains that we're trying to apply multiple values to a single-valued field and does not index the document, unless the ";"

RE: Solr posting question

2012-07-12 Thread karl.wright
I'll need to ask the reporter for more details since it appears the answer is not simple. It may even be an app server issue. Thanks Karl Sent from my Windows Phone -Original Message- From: ext Chris Hostetter Sent: 7/12/2012 8:29 PM To: dev@lucene.apache.org Subject: Re: Solr posting

Solr posting question

2012-07-12 Thread karl.wright
Hi all, I received a report of a problem with posting data to Solr. The post method is a multi-part form, so if you inspect it, it looks something like this: >> boundary--- Content-Disposition: form-data; name=metadata_attribute_name Content-Type: text; charset=utf-8 abc;def;ghi ---bou

RE: Lucene 4.0 Beta

2012-01-11 Thread karl.wright
Having some interest in this issue, may I suggest setting a branch date? On the agreed-upon date, a branch is made. After that date, commits go to trunk and (maybe) are pulled up into the 4.0 branch. If the date is oh, say, 1 week away, people can plan accordingly to yield a relatively stable

RE: Solr plugin component resource cleanup?

2012-01-11 Thread karl.wright
Thanks, Erik, this is not ideal but it will work for my purposes. But it seems a shame that the whole SolrCoreAware setup as it was designed turned out to be so problematic. Karl -Original Message- From: ext Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: Wednesday, January 11, 201

RE: Solr plugin component resource cleanup?

2012-01-11 Thread karl.wright
"SolrCoreAware" and "CloseHook" are related in that you need a SolrCore object in order to call SolrCore.addCloseHook(). Indeed, the javadoc for the CloseHook interface states that the expected way you are supposed to use this in a plugin is via something like this: public void inform(SolrCore

RE: Solr plugin component resource cleanup?

2012-01-08 Thread karl.wright
I created a ticket for this: SOLR-3015. I hope there's a simple solution and I can just close it, but if not I will experiment and try to produce a patch. Karl From: Wright Karl (Nokia-LC/Boston) Sent: Monday, January 02, 2012 11:02 AM To: dev@lucene.apa

RE: Solr plugin component resource cleanup?

2012-01-02 Thread karl.wright
This works fine for a SearchComponent, but if I try this for a QParserPlugin I get the following: [junit] org.apache.solr.common.SolrException: Invalid 'Aware' object: org.apache.solr.mcf.ManifoldCFQParserPlugin@18941f7 -- org.apache.solr.util.plugin.SolrCoreAware must be an instance of: [

Solr plugin component resource cleanup?

2011-12-20 Thread karl.wright
Is there a preferred time/manner for a Solr component (e.g. a SearchComponent) to clean up resources that have been allocated during the time of its existence, other than via a finalizer? There seems to be nothing for this in the NamedListInitializedPlugin interface, and yet if you allocate a r

RE: [jira] [Issue Comment Edited] (SOLR-1895) ManifoldCF SearchComponent plugin for enforcing ManifoldCF security at search time

2011-09-18 Thread karl.wright
I think your expectation for s-d13 may be incorrect. If you use AD as a model, you are effectively applying share security that has no allow sids but some deny sids. With AD you would not get this doc either. -Original Message - From: ext Koji Sekiguchi (JIRA) Sent: 17/09/2011, 11:49

RE: How to access a Lucene contrib package from a Solr contrib package?

2011-09-16 Thread karl.wright
You’re right – the package moved since this was originally developed. An awful lot of stuff has, in fact, moved. ;-) That made the difference in finding that class – now I’ve got to chase down a few others and I should be set. Karl From: ext Steven A Rowe [mailto:sar...@syr.edu] Sent: Friday,

RE: How to access a Lucene contrib package from a Solr contrib package?

2011-09-16 Thread karl.wright
common.compile-core: [javac] Compiling 1 source file to C:\wip\solr\trunk\solr\build\contrib\solr -auth\classes\java [javac] C:\wip\solr\trunk\solr\contrib\auth\src\java\org\apache\solr\auth\Ma nifoldCFSecurityFilter.java:163: cannot find symbol [javac] symbol : class BooleanFilter

RE: How to access a Lucene contrib package from a Solr contrib package?

2011-09-16 Thread karl.wright
Thanks for the reply! Unfortunately, there must be something more to it. This is what I have: >> Solr Integration with ManifoldCF, for repository document authorization << The lucene-libs directory is not even create

How to access a Lucene contrib package from a Solr contrib package?

2011-09-16 Thread karl.wright
Hi folks, I'm trying to turn SOLR-1895 into a real contrib module but I'm having some trouble with the ant build for it. Specifically, the module needs the lucene contrib jar lucene-queries.jar, but I don't know the right way to indicate that in my new solr/contrib/auth/build.xml file. Does a

Solr updater trunk changes

2011-07-27 Thread karl.wright
Hi folks, I'm trying to update to the latest trunk, and there have been changes to the Solr updater that I don't understand how to use. For instance, the following code: CommitUpdateCommand commit = new CommitUpdateCommand(this.request,optimize); ... now requires an array of IndexReader ob

RE: Related project link to ManifoldCF from Solr site?

2011-06-16 Thread karl.wright
I created a ticket for it - SOLR-2602. I'll attach a patch shortly. Karl -Original Message- From: ext Simon Willnauer [mailto:simon.willna...@googlemail.com] Sent: Thursday, June 16, 2011 2:00 PM To: dev@lucene.apache.org Subject: Re: Related project link to ManifoldCF from Solr site? a

Related project link to ManifoldCF from Solr site?

2011-06-16 Thread karl.wright
Hi folks, How hard would it be to get a link to ManifoldCF from the Solr site's related-link section? I'm seeing a lot of people who know Solr but have no idea ManifoldCF even exists, and I'd like to find some way to correct that problem. Karl

RE: Welcome Jan Høydahl as Lucene/Solr committer

2011-06-13 Thread karl.wright
Congratulations, Jan! Karl -Original Message- From: ext Mark Miller [mailto:markrmil...@gmail.com] Sent: Monday, June 13, 2011 10:43 AM To: dev@lucene.apache.org Subject: Welcome Jan Høydahl as Lucene/Solr committer I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl

RE: Brainstorming on Improving the Release Process

2011-03-30 Thread karl.wright
Hi Grant, This is a great post. I'm not a committer for Lucene or Solr, but I'm seriously thinking that much of what Lucene/Solr does right should be considered by the project I AM a committer for: ManifoldCF. Key things I would add based on experience with commercial software development: (A

RE: [jira] Commented: (SOLR-2026) Need infrastructure support in Solr for requests that perform multiple sequential queries

2011-03-04 Thread karl.wright
All that the patch contributes is the infrastructure needed to allow multiple queries. It's structured so that the results from one query are available to construct the query for the next. The patch does not contribute a multi-query query parser, or means of merging the results into a final re

RE: Scoring woes?

2011-01-26 Thread karl.wright
I took my own suggestion and used the DisjunctionMaxQuery. This solved the problem. Karl From: Wright Karl (Nokia-MS/Boston) Sent: Wednesday, January 26, 2011 6:40 PM To: Wright Karl (Nokia-MS/Boston); 'dev@lucene.apache.org' Cc: 'simon.willna...@gmail.com' Subject: RE: Scoring woes? Interestin
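For readers unfamiliar with why DisjunctionMaxQuery changes ranking: a plain boolean OR sums the scores of all matching clauses, while a disjunction-max query takes the best clause score and adds a tie-breaker fraction of the remaining ones. The sketch below models only that arithmetic in plain Java. It does not use the Lucene API, the class and method names are hypothetical, and the clause scores are made-up inputs rather than values from this thread.

```java
// Hypothetical, Lucene-free model of how clause scores are combined.
final class DisMaxScoreSketch {
    // Boolean-OR style combination: every matching clause adds to the score.
    static double sumScore(double[] clauseScores) {
        double sum = 0.0;
        for (double s : clauseScores) {
            sum += s;
        }
        return sum;
    }

    // Disjunction-max combination: the best clause wins, and the other
    // clauses contribute only tieBreaker times their summed scores.
    // Assumes non-negative clause scores.
    static double disMaxScore(double[] clauseScores, double tieBreaker) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : clauseScores) {
            sum += s;
            max = Math.max(max, s);
        }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        double[] clauseScores = {2.0, 1.0, 0.5}; // made-up example scores
        System.out.println(sumScore(clauseScores));
        System.out.println(disMaxScore(clauseScores, 0.0));
        System.out.println(disMaxScore(clauseScores, 0.5));
    }
}
```

With a tie-breaker of 0.0, a document that matches weakly in many fields can no longer outrank a document that matches strongly in one.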

RE: Scoring woes?

2011-01-26 Thread karl.wright
Interesting datapoint: After the reindexing, the following query returns the right results in the right order: (+value_3:Lexington~0.877 +value_1:Massachusetts~0.877 +*:*^0.0 +*:*^0.0 +*:*^0.0) (+value_3:Lexington~0.877 +value_1:Massachusetts~0.877 +value_4:_empty_ +value_5:_empty_ +value_6:_em

Scoring woes?

2011-01-26 Thread karl.wright
I have an interesting scoring problem, which I can't seem to get around. The problem is best stated as follows: (1)My schema has several independent fields, e.g. "value_0", "value_1", ... "value_6". (2)Every document has all of these fields set, with a-priori field norm values. Where

RE: Lucene & Google Summer of Code 2011

2011-01-24 Thread karl.wright
A nice idea. I've always wondered about this, because for me "summer" and "code" do not go together very well. ;-) Karl -Original Message- From: ext Simon Willnauer [mailto:simon.willna...@googlemail.com] Sent: Monday, January 24, 2011 3:30 PM To: dev@lucene.apache.org Subject: Lucene &

RE: Odd Boolean scoring behavior?

2011-01-21 Thread karl.wright
This is a query that wraps another query, which limits the number of results returned from it to some specific number. It seems very helpful for the situation where you have a lot of clauses in a query and each of them is expected to be small, but there is a chance of having one clause return l

RE: Odd Boolean scoring behavior?

2011-01-21 Thread karl.wright
Turns out that I inadvertently reverted one of Simon's changes to CutoffQueryWrapper, which explains the second effect. So all is now well. Thanks for your assistance! Karl From: Wright Karl (Nokia-MS/Boston) Sent: Thursday, January 20, 2011 9:44 PM To:

RE: Odd Boolean scoring behavior?

2011-01-20 Thread karl.wright
Found the cause of the zero querynorms, and fixed it. But the results are still not as I would expect. The first result has language=ger but scores higher than the second result which has language=eng. And yet, my query is boosting like this: Boolean OR Boolean (boost = 100.0) AND (langua

RE: Odd Boolean scoring behavior?

2011-01-20 Thread karl.wright
So I think I understand where the blank values and repeats come from. Those are the expansions of fuzzy queries against fields that have no matches whatsoever for the fuzzy values in question. So those are indeed OK. I guess then that the problem is that the scoring explanation makes no sense.

RE: Odd Boolean scoring behavior?

2011-01-20 Thread karl.wright
The original query is fine, and has the boost as expected: ((+language:eng +( CutoffQueryWrapper((+value_0:bunker~0.8332333 +value_0:hill)^0.667) CutoffQueryWrapper((+othervalue_0:bunker~0.8332333 +value_0:hill)^0.5714286) CutoffQueryWrapper((+value_0:bunker~0.8332333 +otherval

RE: Odd Boolean scoring behavior?

2011-01-20 Thread karl.wright
I tried commenting out the final OR term, and that excluded all records that were out-of-language as expected. It's just the boost that doesn't seem to work. Exploring the explain is challenging because of its size, but there are NO boosts recorded of the size I am using (10.0). Here's the ba

RE: Query parser contract changes?

2011-01-18 Thread karl.wright
This turns out to have indeed been due to a recent, but un-announced, index format change. A rebuilt index worked properly. Thanks! Karl From: ext karl.wri...@nokia.com [karl.wri...@nokia.com] Sent: Monday, January 17, 2011 10:53 AM To: dev@lucene.apache

RE: Query parser contract changes?

2011-01-17 Thread karl.wright
Another data point: the standard query parser actually ALSO fails when you do anything other than a *:* query. When you specify a field name, it returns zero results: root@duck93:/data/solr-dym/solr-dym# curl "http://localhost:8983/solr/nose/standard?q=value_0:a*"; 07value_0:a* But: root@

Query parser contract changes?

2011-01-17 Thread karl.wright
Hi folks, I'm sorely puzzled by the fact that my QParser implementation ceased to work after the latest Solr/Lucene trunk update. My previous update was about ten days ago, right after Mike made his index changes. The symptom is that, although the query parser is correctly called, and seems t

RE: LICENSE/NOTICE file contents

2011-01-10 Thread karl.wright
Everyone should (carefully) read the Apache License 2.0 section 4(d). It turns out that Apache has a somewhat unusual definition for the term "derivative work". It has to be something you actually modified, not just include. So the incubator approach seems correct; neither the HSQLDB notice n

RE: LICENSE/NOTICE file contents

2011-01-08 Thread karl.wright
>> Nope - wasn't me that added the license stuff into NOTICE.txt ;-) But, including Jetty's NOTICE seems appropriate for our NOTICE. It's just the license parts of the HSQLDB and SLF4J that should be moved to LICENSE.txt << The NOTICE text is actually different from the LICENSE text for

RE: LICENSE/NOTICE file contents

2011-01-08 Thread karl.wright
From svn, Yonik seems to be the go-to guy for LICENSE and NOTICE stuff. Yonik, do you remember why the HSQLDB and Jetty notice text was included in Solr's NOTICE.txt? The incubator won't release ManifoldCF until we answer this question. ;-) Karl F

LICENSE/NOTICE file contents

2011-01-08 Thread karl.wright
This list might be interested to know that the current Solr LICENSE and NOTICE file contents are not Apache standard. The ManifoldCF project based its LICENSE and NOTICE files on the Solr ones and got the following icy reception in the incubator: >> The NOTICE file is still incorrect and i

Stemming using automata

2010-11-17 Thread karl.wright
Folks, I had an interesting conversation with Simon a few weeks back. It occurred to me that it might be possible to build an automaton that handles stemming and pluralization on searches. Just a thought... Karl

RE: svn commit: r1032995 - in /lucene/dev/trunk/solr/src/site/src/documentation/content/xdocs/images: solr.jpg solr_FC.eps

2010-11-09 Thread karl.wright
Is this something ManifoldCF needs to do also? Karl -Original Message- From: ext Grant Ingersoll [mailto:gsing...@apache.org] Sent: Tuesday, November 09, 2010 3:34 PM To: dev@lucene.apache.org Subject: Re: svn commit: r1032995 - in /lucene/dev/trunk/solr/src/site/src/documentation/conten

RE: Compilation errors

2010-11-05 Thread karl.wright
Never mind - this was due to a local change in my work area. Karl _ From: Wright Karl (Nokia-MS/Boston) Sent: Friday, November 05, 2010 3:51 PM To: 'dev@lucene.apache.org' Subject: Compilation errors Solr trunk seems to have compilation errors: [j

Compilation errors

2010-11-05 Thread karl.wright
Solr trunk seems to have compilation errors: [javac] C:\wip\solr-dym\lucene_solr_trunk\solr\src\java\org\apache\solr\handler\component\ResponseBuilder.java:124: cannot find symbol [javac] symbol : variable debug [javac] location: class org.apache.solr.handler.component.ResponseBuild

RE: inconsistency/performance trap of empty terms

2010-10-28 Thread karl.wright
In database queries, it is often useful to treat an empty value specially, and be able to search explicitly for records that have (for instance) no field X, or no value for field X. I can't regurgitate offhand all the precise situations that I've used this and claim that they would apply to a s

RE: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread karl.wright
Glad to be of service. ;-) Karl -Original Message- From: ext Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Thursday, October 28, 2010 11:48 AM To: dev@lucene.apache.org; simon.willna...@gmail.com Subject: Re: ArrayIndexOutOfBounds exception using FieldCache On Thu, Oct 28,

RE: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread karl.wright
The internet is not the bottleneck ;-). It's the intranet here. Index is 14GB. Besides, it looks like Yonik found the problem. Karl -Original Message- From: ext Walter Underwood [mailto:wun...@wunderwood.org] Sent: Thursday, October 28, 2010 11:00 AM To: dev@lucene.apache.org Subject:

RE: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread karl.wright
Yep, that fixed it. ;-) Everything seems happy now. Karl -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of ext Yonik Seeley Sent: Thursday, October 28, 2010 10:17 AM To: dev@lucene.apache.org Subject: Re: ArrayIndexOutOfBounds exception using FieldCache On

RE: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread karl.wright
Talked with IT here - they don't recommend external transfers of this size. So I think we'd best try the "instrument and repeat" approach instead. Karl -Original Message- From: ext karl.wri...@nokia.com [mailto:karl.wri...@nokia.com] Sent: Thursday, October 28, 2010 8:16 AM To: dev@lu

RE: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread karl.wright
It's on an internal Nokia machine, unfortunately, so the only way I can transfer it out is with my credentials, or by email, which is definitely not going to work ;-). But if you can provide me with an account on a machine I'd be transferring it to, I may be able to scp it from here. Karl -

RE: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread karl.wright
Not good indeed. Synched to trunk, blew away old indexes, reindexed, same behavior. So I think we've got a problem, Houston. ;-) Karl -Original Message- From: ext Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Wednesday, October 27, 2010 11:08 AM To: dev@lucene.apache.org

ArrayIndexOutOfBounds exception using FieldCache

2010-10-27 Thread karl.wright
Hi Folks, I just tried to index a data set that was probably 2x as large as the previous one I'd been using with the same code. The indexing completed fine, although it was slower than I would have liked. ;-) But the following problem occurs when I try to use FieldCache to look up an indexed

RE: FW: Solr and LCF security at query time

2010-10-04 Thread karl.wright
Is there Kerberos support in this offering? That's what's missing. LDAP support is actually built into java, and the Active Directory authority makes use of it. So all we need is the authentication piece. Karl From: ext Lance Norskog [goks...@gmail.co

RE: discussion about release frequency.

2010-09-20 Thread karl.wright
“but again, i have serious questions about maven in general.” Maybe you just need to drink the Maven Koolaid. Unless they have something stronger… ;-) Karl From: ext Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, September 20, 2010 1:08 PM To: dev@lucene.apache.org Subject: Re: discussio

RE: discussion about release frequency.

2010-09-20 Thread karl.wright
My 2c... Maven is pretty much incompatible with some of the standards of release engineering, namely repeatable builds. It tries to do pretty much the same job that apt does under debian and ubuntu and is therefore not terribly useful in that environment. All the mavenistas I've talked to hav

RE: Now, a lost data problem with trunk too

2010-09-14 Thread karl.wright
Yes. Of course. My oversight. So I did the obvious thing and searched for the value field directly, and it is there: POI|DEU:205:20187477:1014564|brandenburger torger52.3993513.04793brandenburger tor, potsdam, deutschland So, something about the way I am searching for it is not right. Looki

Now, a lost data problem with trunk too

2010-09-14 Thread karl.wright
Hi folks, It looks like the handle leak may be real - Simon Willnauer has been looking at it and could not find an explanation for the behavior I have been seeing. But before we got too far on that problem, I encountered what appears to be an even more serious problem. Specifically, I'm losin

RE: Trunk file handle leak?

2010-09-10 Thread karl.wright
Hi Yonik, Be that as it may, I'm seeing a steady increase in file handles used by that process over an extended period of time (now 20+ minutes): r...@duck6:~# lsof -p 22379 | wc 7867714 108339 r...@duck6:~# lsof -p 22379 | wc 7877723 108469 r...@duck6:~# lsof -p 22379 | wc

RE: Trunk file handle leak?

2010-09-10 Thread karl.wright
Hi Simon, (1) There are periodic commits, every 10,000 records. (2) I have no searcher/reader open at the same time, that I am aware of. This is a straight indexing task. (You ought to know, you wrote some of the code!) (3) I *do* see auto warming being called, but it seems not to happen at the
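The commit cadence mentioned in point (1) can be sketched as a simple counting loop. This is an illustration under stated assumptions, not the indexing code from the thread: the class name, the method name, and the Runnable standing in for the real commit call are all hypothetical, and the final commit for a partial last batch is an assumption the post does not confirm.

```java
// Hypothetical sketch of "commit every N records"; no Lucene or Solr calls.
final class PeriodicCommitSketch {
    // Indexes recordCount records and invokes commit after every
    // commitEvery records. Returns the number of commits issued.
    static int indexAll(int recordCount, int commitEvery, Runnable commit) {
        int commits = 0;
        for (int i = 1; i <= recordCount; i++) {
            // A real indexer would add record i to the index here.
            if (i % commitEvery == 0) {
                commit.run();
                commits++;
            }
        }
        // Assumption: a partial final batch is also committed.
        if (recordCount % commitEvery != 0) {
            commit.run();
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        int commits = indexAll(25_000, 10_000, () -> { /* commit would go here */ });
        System.out.println(commits);
    }
}
```

When chasing a handle leak, logging the open-handle count inside the commit callback is one way to see whether the growth tracks commits or something else.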

Trunk file handle leak?

2010-09-10 Thread karl.wright
Hi folks, I am running into what appears to be a file handle leak in trunk during indexing. It's not clear yet what the causative event is, although the indexing runs for more than an hour before it occurs. The system is Ubuntu, and has 1024 file handles (per process). This is on a trunk che

RE: Question about string retrieval with FieldCache in trunk

2010-08-18 Thread karl.wright
Ah, the old "switch that you shouldn't throw" trick. ;-) Karl -Original Message- From: ext Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Wednesday, August 18, 2010 2:39 PM To: dev@lucene.apache.org Subject: Re: Question about string retrieval with FieldCache in trunk OK, th

RE: Question about string retrieval with FieldCache in trunk

2010-08-18 Thread karl.wright
This field was *not* indexed, just stored. That seems to have been the problem. Not sure why it must be indexed to be retrievable, but clearly it must. Karl -Original Message- From: ext Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Wednesday, August 18, 2010 2:28 PM To:

RE: Question about string retrieval with FieldCache in trunk

2010-08-18 Thread karl.wright
Thanks, didn't think to look there. Unfortunately I'm still getting back empty strings - and this is for a required field. So something isn't right... Maybe I'll pester Simon. ;-) Karl -Original Message- From: ext Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Wednesday,

RE: Question about string retrieval with FieldCache in trunk

2010-08-18 Thread karl.wright
Passing in new BytesRef(), I always get back an empty string. So clearly the comment *is* correct, and neither you nor I know how to do this properly. ;-) Any other suggestions? Karl -Original Message- From: Wright Karl (Nokia-MS/Cambridge) Sent: Wednesday, August 18, 2010 11:28 AM To

RE: Question about string retrieval with FieldCache in trunk

2010-08-18 Thread karl.wright
If you are correct, the comment is certainly incorrect, since it implies that the SAME BytesRef is returned as you pass in. Karl -Original Message- From: ext Jason Rutherglen [mailto:jason.rutherg...@gmail.com] Sent: Wednesday, August 18, 2010 11:27 AM To: dev@lucene.apache.org Cc: yo...

RE: Question about string retrieval with FieldCache in trunk

2010-08-18 Thread karl.wright
Exactly. getTerms() returns a DocTerms, which has this: /** The BytesRef argument must not be null; the method * returns the same BytesRef, or an empty (length=0) * BytesRef if the doc did not have this field or was * deleted. */ public abstract BytesRef getTerm(int docID

Question about string retrieval with FieldCache in trunk

2010-08-18 Thread karl.wright
Hi folks, What is the proper way to retrieve a string field from lucene in trunk? I'm specifically looking for the equivalent of: String[] fieldValues = FieldCache.DEFAULT.getStrings(reader,fieldName); String actualValue = fieldValues[luceneID-docBase]; The getStrings() method seems to have go

RE: Index corrupted after delta import using DIH

2010-08-17 Thread karl.wright
There is no patch because the fixes were committed directly to trunk. Also, the corrupted indexes will need to be rebuilt. You do not, however, need to "delete and reinstall" everything, just the Lucene jars. Karl -Original Message- From: ext Pradeep Pujari [mailto:prade...@rocketmail

RE: Index corrupted after delta import using DIH

2010-08-16 Thread karl.wright
Yes, this was addressed by a set of fixes from Mike McCandless back in early August. Karl From: ext Pradeep Pujari [prade...@rocketmail.com] Sent: Monday, August 16, 2010 5:41 PM To: dev@lucene.apache.org Subject: Index corrupted after delta import using

RE: User rights

2010-08-16 Thread karl.wright
There are some contributions to Solr that address this; you might want to search for JIRA tickets in https://issues.apache.org/jira. LCF has a model of document security you may also find interesting, and contributions have been made which enforce this security within Solr as well. Karl From

FWD: [jira] Updated: (SOLR-2032) Map-viewer demo of SolrSpatial test data

2010-08-08 Thread karl.wright
Fyi. Karl --- original message --- From: "Schmidt Christopher (Nokia-MS/Cambridge)" Subject: Re: [jira] Updated: (SOLR-2032) Map-viewer demo of SolrSpatial test data Date: August 7, 2010 Time: 9:32:52 AM On Aug 7, 2010, at 2:19 AM, Wright Karl (Nokia-MS/Cambridge) wrote: > Also, Apache will

Setting default query parser via solrconfig.xml

2010-08-04 Thread karl.wright
Folks, Now that I've developed some infrastructure for multiple queries in a single request, I'm running into some downstream issues where design that worked OK on a single-query-per-request basis now makes less sense. One specific case is when each query in the sequence needs to use a differen

RE: What is the best way in Solr to perform iterative search?

2010-08-03 Thread karl.wright
I think a general feature along the lines I discussed would permit *all* of the scenarios to be dealt with effectively. Very good - I'll create a feature, ticket, and patch, and link to your ticket as well. ;-) Thanks, Karl From: ext Koji Sekiguchi [k...

RE: What is the best way in Solr to perform iterative search?

2010-08-03 Thread karl.wright
Oops - I meant the "process()" method, not "prepare()", below. Looking at the code, it occurs to me that ideally such sequences of queries ought to have their own supporting infrastructure within Solr, because doing two queries inside a search component's process() method basically defeats quer

RE: What is the best way in Solr to perform iterative search?

2010-08-03 Thread karl.wright
Thanks - I've got the latter situation, actually. I presume when you wrote your QueryComponent you did the first query execution, modification, and second execution all within the equivalent of the prepare() method? I didn't want to replace QueryComponent itself, since it does a fair bit of st

What is the best way in Solr to perform iterative search?

2010-08-03 Thread karl.wright
Folks, I have a search task which needs to do the following: - Do a search - Use the results of that search to form a final query, which when executed returns the overall results The question is, what is the best way to build components within Solr to handle this request flow? My

RE: busywait hang using extracting update handler on trunk

2010-08-02 Thread karl.wright
The result of this run yields the following: >> C:\wip\solr\trunk\solr\example\solr>"c:\Program Files\Java\jdk1.6.0_19\bin\java" -cp \solr-dym\searcher\lib\lucene-core-4.0-dev.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex data\index Opening index @ data\index Segments fil

RE: busywait hang using extracting update handler on trunk

2010-08-02 Thread karl.wright
Thanks. And, for broader consumption, the AIOOB exception is as follows: >> The AIOOB from my reproducible case is as follows: SEVERE: java.lang.IndexOutOfBoundsException: Index: 124, Size: 5 at java.util.ArrayList.RangeCheck(ArrayList.java:547) at java.util.ArrayList.get(Arra

RE: busywait hang using extracting update handler on trunk

2010-08-02 Thread karl.wright
>> And can you run CheckIndex and post that output, and also the AIOOBE you hit for certain searches? << Where can I find CheckIndex? Karl From: ext Michael McCandless [luc...@mikemccandless.com] Sent: Thursday, July 29, 2010 11:56 AM To: dev@lucen

RE: busywait hang using extracting update handler on trunk

2010-07-29 Thread karl.wright
Unfortunately, this theory goes out the window because I discovered other char 243's in the data prior to the one that apparently corrupts the index. I'd love to work with someone here who has better knowledge of indexreader and indexwriter classes. I could potentially post the data itself but t

RE: busywait hang using extracting update handler on trunk

2010-07-28 Thread karl.wright
One of the characters that causes trouble is unicode character 243. Karl --- original message --- From: "Wright Karl (Nokia-MS/Cambridge)" Subject: RE: busywait hang using extracting update handler on trunk Date: July 28, 2010 Time: 6:0:4 AM It appears that whenever I see a merge failure, I a

RE: busywait hang using extracting update handler on trunk

2010-07-28 Thread karl.wright
It appears that whenever I see a merge failure, I also apparently have a corrupt index (I get ArrayIndexOutOfBounds exceptions when searching for certain things). So that may be the underlying cause of the merge infinite loop. I've blown away the indexes repeatedly and tried to rebuild. I am
