Re: Solr on HDFS in a Hadoop cluster

Charles VALLEE Thu, 08 Jan 2015 06:49:30 -0800

Thanks a lot Otis,

While reading the SolrCloud documentation to understand how SolrCloud
could run on HDFS, I got confused by the terms leader, replica,
"non-replica" shard, core, index, and collection.
First it says that one cannot add shards, then that one can add
replica-only shards, and then the last "Shard Splitting" paragraph states
that something changed starting with Solr 4.3.
But it doesn't state that splitting a shard can result in a new non-replica
shard on a newly added node, thus increasing the amount of storage
available to the index / collection. Instead, it states that the "split
action effectively makes two copies of the data as new shards", which
sounds a lot like replica-style shards.
So does it?
Is there some sort of tutorial describing how to add storage capacity
for an index / collection by adding a node / shard - core to which new
documents can be sent for indexing? (Of course, load-balancing would be
triggered, so it looks like documents would be added to shards across
a set of nodes.)
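
For reference, the shard splitting mentioned above is requested through the
Collections API with the SPLITSHARD action. Below is a minimal sketch that
only builds such a request URL. The base URL, the collection name
"collection1" and the shard name "shard1" are assumptions for illustration,
not values from this thread, and the sketch does not contact any server.

```python
from urllib.parse import urlencode


def split_shard_url(base_url, collection, shard):
    """Build a Collections API URL asking SolrCloud to split one shard.

    base_url, collection and shard are hypothetical example values.
    """
    params = urlencode({
        "action": "SPLITSHARD",    # Collections API action name
        "collection": collection,  # collection that owns the shard
        "shard": shard,            # shard to split
    })
    return f"{base_url}/admin/collections?{params}"


# Assumed local SolrCloud node; adjust to the real host and names.
url = split_shard_url("http://localhost:8983/solr", "collection1", "shard1")
print(url)
```

Sending this URL with an HTTP GET to a running SolrCloud node would start
the split; whether the resulting shards can then be hosted on a newly added
node is exactly the open question above.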
Thanks,

Charles VALLEE
Centre de compétence Big data
EDF – DSP - CSP IT-O
DATACENTER - Expertise en Energie Informatique (EEI)
32 avenue Pablo Picasso
92000 Nanterre

charles.val...@edf.fr
Tél. : + (0) 1 78 66 69 81

A simple gesture for the environment: print this message only if you
need to.

From:    otis.gospodne...@gmail.com
To:      solr-user@lucene.apache.org
Date:    06/01/2015 18:55
Subject: Re: Solr on HDFS in a Hadoop cluster

Oh, and https://issues.apache.org/jira/browse/SOLR-6743

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On Tue, Jan 6, 2015 at 12:52 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:

> Hi Charles,
>
> See http://search-lucene.com/?q=solr+hdfs and
> https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS
>
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Tue, Jan 6, 2015 at 11:02 AM, Charles VALLEE <charles.val...@edf.fr>
> wrote:
>
>> I am considering using *Solr* to extend *Hortonworks Data Platform*
>> capabilities to search.
>>
>> - I found tutorials to index documents into a Solr instance from *HDFS*,
>> but I guess this solution would require a Solr cluster distinct from the
>> Hadoop cluster. Is it possible to have a Solr integrated into the Hadoop
>> cluster instead? - *With the index stored in HDFS?*
>>
>> - Where would the processing take place (could it be handed down to
>> Hadoop)? Is there a way to guarantee a level of service (CPU, RAM) - to
>> integrate with *Yarn*?
>>
>> - What about *SolrCloud*: what does it bring regarding Hadoop based
>> use-cases? Does it stand for a Solr-only cluster?
>>
>> - Well, if that could lead to something working with a roles-based
>> authorization-compliant *Banana*, it would be Christmas again!
>>
>> Thanks a lot for any help!
>>
>> Charles
>>
