Upayavira, did you ever do this?

Ha, look at my email from 20 days ago and this:
https://github.com/javanna/elasticshell

Otis
--
Solr & ElasticSearch Support
http://sematext.com/

On Wed, Feb 6, 2013 at 2:38 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:

> Btw, wouldn't this be a chance to create a Solr CLI tool, much like
> es2unix? Maybe with a shell? I'm offline now, but I recently came across
> a Java lib that makes this easy... jclam jsomething ...
>
> Otis
> Solr & ElasticSearch Support
> http://sematext.com/
> On Feb 6, 2013 8:48 AM, "Jan Høydahl" <jan....@cominvent.com> wrote:
>
>> By dependencies I meant external jar dependencies. Perhaps extensions
>> could have deps while leaving the "core" compilable without them?
>>
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> Solr Training - www.solrtraining.com
>>
>> 5. feb. 2013 kl. 17:10 skrev Upayavira <u...@odoko.co.uk>:
>>
>> > By dependencies, do you mean other java classes? I was thinking of
>> > splitting it out into a few classes, each of which is clearer in its
>> > purpose.
>> >
>> > Upayavira
>> >
>> > On Tue, Feb 5, 2013, at 02:26 PM, Jan Høydahl wrote:
>> >> Wiki page exists already: http://wiki.apache.org/solr/post.jar
>> >>
>> >> I'm happy to consider a refactoring, especially if it makes it SIMPLER to
>> >> read and interact with and doesn't add a ton of mandatory dependencies.
>> >> It should probably still be possible to say something like
>> >>
>> >>  javac org/apache/solr/util/SimplePostTool.java
>> >>  java -cp . org.apache.solr.util.SimplePostTool -h
>> >>
>> >> That's just how I've been thinking so far though. If other committers are
>> >> happy with abandoning the simple-ness and instead creating a
>> >> best-practices-based, feature-rich tool with dependencies, then I'll not
>> >> object.
>> >>
>> >> --
>> >> Jan Høydahl, search solution architect
>> >> Cominvent AS - www.cominvent.com
>> >> Solr Training - www.solrtraining.com
>> >>
>> >> 5. feb. 2013 kl. 05:22 skrev Upayavira <u...@odoko.co.uk>:
>> >>
>> >>> Thx Jan,
>> >>>
>> >>> All I know is I've got a data set of 500k documents, Solr formatted, and
>> >>> I want it to be as easy as possible to get them into Solr. I also want
>> >>> to be able to show the benefit of multithreading. The outcome would
>> >>> really be "make sure your code uses multiple threads to push to Solr"
>> >>> rather than "use post.jar in production". I see post.jar as a
>> >>> demonstration tool, rather than anything else, and am considering adding
>> >>> another feature to enhance that.
>> >>>
>> >>> However, I did stall once I started looking at the SimplePostTool
>> >>> class, because it is losing its connection with the term 'Simple'.
>> >>> Adding multithreading, however useful, correct, whatever, would
>> >>> completely push it over the edge. Thus, I think the proper approach is
>> >>> to refactor the tool into a number of classes, and only then think about
>> >>> adding multithreading as a completely separate affair. I'm more than
>> >>> happy to have a go at that refactoring, especially if you're prepared to
>> >>> review it.
>> >>>
>> >>> I guess the other thing that is much needed is a wiki page that details
>> >>> the features of the tool, and also explains that its role is
>> >>> educational, rather than anything else.
>> >>>
>> >>> Upayavira
>> >>>
>> >>> On Mon, Feb 4, 2013, at 09:10 PM, Jan Høydahl wrote:
>> >>>> Hi,
>> >>>>
>> >>>> Hmm, the tool is getting bloated for a one-class no-deps tool already :)
>> >>>> Guess it would be useful too with real-life code examples using SolrJ and
>> >>>> other libs as well (such as robots.txt lib, commons-cli etc), but whether
>> >>>> that should be an extension of SimplePostTool or a totally new tool from
>> >>>> scratch is something to discuss. Please bring on your ideas of how you
>> >>>> plan to extend it, perhaps even simplifying the code in the process?
>> >>>>
>> >>>> --
>> >>>> Jan Høydahl, search solution architect
>> >>>> Cominvent AS - www.cominvent.com
>> >>>> Solr Training - www.solrtraining.com
>> >>>>
>> >>>> 3. feb. 2013 kl. 17:19 skrev Upayavira <u...@odoko.co.uk>:
>> >>>>
>> >>>>> I have a scenario in which I need to post 500,000 documents to Solr as a
>> >>>>> test. I have these documents in XML files already formatted in Solr's
>> >>>>> XML format.
>> >>>>>
>> >>>>> Posting to Solr using post.jar takes 1m55s. With a bit of bash
>> >>>>> jiggery-pokery, I was able to get this down to 1m08s by running four
>> >>>>> concurrent post.jar instances, which strikes me as a significant
>> >>>>> improvement.
>> >>>>>
>> >>>>> I'm considering adding multithreaded capabilities to post.jar, but
>> >>>>> before I go to that effort, I wanted to see if anyone else would
>> >>>>> consider it a useful feature. Given that the SimplePostTool is becoming
>> >>>>> far from simple, I wanted to see whether the feature is likely to be
>> >>>>> accepted before I put in the effort. Also, I would need to consider
>> >>>>> which parts of the tool to add that to. Currently I only want it for
>> >>>>> posting XML docs, but there are also crawling capabilities in it.
>> >>>>>
>> >>>>> Thoughts?
>> >>>>>
>> >>>>> Upayavira
>> >>>>
>> >>
>>
>>
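
For readers following the thread: the "use multiple threads to push to Solr" advice could be sketched roughly as below. This is not the SimplePostTool code and not anything proposed in the thread; the class name ParallelPostSketch, the methods postAll and postFile, and the update URL are all assumptions made up for illustration. It only uses JDK classes, in keeping with the no-dependencies constraint discussed above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical sketch: post many files to Solr from a fixed thread pool.
class ParallelPostSketch {

    // Assumed update URL; adjust it to match your own Solr install.
    static final String UPDATE_URL = "http://localhost:8983/solr/update";

    /** Runs poster on every file using a fixed pool; returns how many succeeded. */
    static int postAll(List<Path> files, int threads, Predicate<Path> poster)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (Path file : files) {
                Callable<Boolean> task = () -> poster.test(file);
                results.add(pool.submit(task));
            }
            int ok = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {   // blocks until that file has been handled
                    ok++;
                }
            }
            return ok;
        } finally {
            pool.shutdown();
        }
    }

    /** Posts one Solr-XML file over HTTP; returns true on a 200 response. */
    static boolean postFile(Path file) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(UPDATE_URL).openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/xml");
            try (OutputStream out = conn.getOutputStream()) {
                Files.copy(file, out);
            }
            return conn.getResponseCode() == 200;
        } catch (IOException e) {
            return false;   // treat any I/O failure as "not posted"
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: ParallelPostSketch <threads> <file>...");
            return;
        }
        int threads = Integer.parseInt(args[0]);
        List<Path> files = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            files.add(Paths.get(args[i]));
        }
        int ok = postAll(files, threads, ParallelPostSketch::postFile);
        System.out.println(ok + " of " + files.size() + " files posted");
    }
}
```

Something like `java ParallelPostSketch 4 docs1.xml docs2.xml` would mirror the four-instance experiment described in the original message. The sketch does not send a commit, so depending on the Solr configuration one may still be needed before the documents become searchable.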
