I think there might not be enough time.

Bertrand, WDYT?

Is it critical for project success, or an add-on?

Ian


On 2 May 2013 16:34, Ilya Velesevich <ilya.velesev...@gmail.com> wrote:

> Hi Ian,
>
> Many thanks for your reply!
>
> Also, one additional clarification about "using DataImportHandler or
> ManifoldCF to provide search for Sling resources using Solr": could you
> share some thoughts about this task? Or do you think this task
> should not be part of GSoC, as there might not be enough time to
> implement such support?
>
> Thanks,
> Ilya
>
>
> On Wed, May 1, 2013 at 2:05 AM, Ian Boston <i...@tfd.co.uk> wrote:
>
> > Hi
> > Some comments in line,
> > but please remember to submit this proposal at the GSoC site so that it
> > can be reviewed.
> > The deadline is
> >
> > 3rd May 2013
> >
> > I.e. this Friday.
> >
> > Ian
> > (More below).
> >
> >
> > On 30 April 2013 19:15, Ilya Velesevich <ilya.velesev...@gmail.com> wrote:
> >
> > > Hi Everyone,
> > >
> > > I’m working on a proposal for the “Apache Solr backend for Apache Sling”
> > > task as part of Google Summer of Code 2013 –
> > > https://issues.apache.org/jira/browse/SLING-2795. Thus far I have been
> > > reading articles, watching videos and looking through source code to
> > > investigate the topic in more depth. Now I want to describe my vision of
> > > the task and the implementation approach. All your comments and
> > > suggestions would be very helpful for improving my proposal and getting
> > > more value out of implementing the task.
> > >
> > > I see several parts of the task.
> > >
> > > *1. Provide CRUDL operations for Solr data through Sling API.*
> > >
> > > This will allow creating Sling resources that reside in a Solr server
> > > and querying them through the Sling API using Solr search capabilities.
> > > Solr query syntax should be used for queries.
> > >
> > > From the Sling API perspective, a custom *ResourceProvider* (and
> > > *Resource*) implementation will be created, additionally implementing
> > > *QueriableResourceProvider* and *ModifyingResourceProvider*. (If
> > > necessary, the *RefreshableResourceProvider* and
> > > *DynamicResourceProvider* interfaces will also be implemented.) The SolrJ
> > > API will be used to communicate with the Solr server.
> > >
> >
> >
> > Yes (and you might want to think about running Solr embedded for dev
> > purposes).
> >
> >
> > >
> > > *2. Provide convenient ways to create Solr resources based on
> > > different data.*
> > >
> > > *2.1. Create a Solr resource based on an arbitrary Sling resource.* This
> > > will allow adding Sling resources to a Solr server for efficient search.
> > > The created Solr resource will also hold a reference (most likely, the
> > > resource path) to the original Sling resource. The *Adaptable* concept
> > > seems to be a reasonable way of implementing this functionality – to
> > > “convert” an arbitrary Sling resource to a Solr resource and to resolve
> > > the original Sling resource based on the Solr resource.
> > >
> > > Also, I think that not all metadata of a Sling resource should be used
> > > when creating the corresponding Solr resource – so this task should also
> > > include some configuration to specify which metadata needs to be passed
> > > to the Solr resource. Additionally, some transformations on resource
> > > metadata could be supported here.
> > >
> >
> >
> >
> > I think you should initially focus on just getting or resolving Solr
> > resources using the ResourceResolver.
> >
> > Later you can add creating those resources via the
> > ModifyingResourceProvider. If you think of a Resource as a map of
> > properties, then it fits the Solr document model reasonably well, i.e. a
> > Resource maps 1:1 to a Solr Document.
> >
> >
> >
> > >
> > > *2.2.* When creating Solr resources, not all data can be efficiently
> > > stored in Solr – for instance, large binary files. In that situation,
> > > one could create a Sling resource (for instance, FileSystem or
> > > Jackrabbit) and then create a Solr resource based on that Sling resource
> > > – this will allow both efficient search through Solr and effective
> > > storage options. As an optimization, these steps could be done
> > > automatically based on some configuration. So *when a Solr resource is
> > > created we could analyze it* (analyze metadata, try to adapt it to
> > > certain types) *and create additional supporting resources in other
> > > parts of the Sling virtual resource tree if necessary*. What do you
> > > think – is it necessary to implement such functionality, or will option
> > > 2.1 be sufficient? What useful scenarios do you see for this task
> > > besides the “large binary” scenario?
> > >
> >
> >
> > Resources may have properties that are streams. How the stream is stored
> > and delivered is an implementation detail of the ResourceProvider and the
> > object it provides. So a SolrResourceProvider might provide SolrResource
> > objects, which expose a SolrResourceDocument when
> > resource.adaptTo(SolrResourceDocument.class) is invoked.
> >
> > The SolrResourceDocument might then have a getBodyStream() method.
> >
> >
> > >
> > > *3. Provide a solution to support search for arbitrary Sling
> > > resources through Sling API using Solr capabilities.*
> > >
> > > From my point of view, this one needs some external solution to support
> > > things like full indexing, incremental indexing, creating different
> > > schedules, etc. I see that the Solr DataImportHandler or Apache
> > > ManifoldCF could be utilized for this task. So the concept of the
> > > solution here would be to write the necessary implementation so that the
> > > Sling virtual resource tree could be used as a data source for one of
> > > the components mentioned above. What do you think about this approach?
> > > Could you advise some other alternatives to the Solr DataImportHandler
> > > and Apache ManifoldCF for implementing this task?
> > >
> > >
> > >
> > > Also, I’ve got a couple of questions on the Sling API:
> > >
> > >    - Am I right that the “best practice” way to provide a bundle with a
> > >    custom *ResourceProvider* implementation is to use the Apache Felix
> > >    Maven SCR Plugin and specify certain SCR annotations (like
> > >    *@Component*, *@Service* and some others) on the corresponding classes
> > >    – the *ResourceProvider* or *ResourceProviderFactory* implementation
> > >    in this case?
> > >
> >
> > IIRC you will implement a ResourceProviderFactory as a @Component with a
> > @Service annotation indicating that it implements the
> > ResourceProviderFactory interface. It will then build ResourceProvider
> > objects. To check, I would need to have a quick look at the API.
> >
> >
> >
> >
> > >
> > >
> > >    - I see that *ResourceResolver* is intended to be used by clients to
> > >    obtain and work with Sling resources. It also seems to me that it is
> > >    unlikely to be necessary to create a custom *ResourceResolver*
> > >    implementation for the Solr integration task. But still, could you
> > >    please describe some valid typical cases when one would need to create
> > >    a custom *ResourceResolver*?
> > >
> >
> >
> >
> > Correct, you won't need to create a ResourceResolver.
> >
> >
> > >
> > >
> > >    - Suppose I have configured the same resource provider implementation
> > >    (like the file system resource provider or a possible Solr resource
> > >    provider) under two URLs, “/url1” and “/url2”. Now I want to perform
> > >    *findResources*/*queryResources*, but only for the resources residing
> > >    under “/url1”. Is it possible to limit search results in such a way?
> > >    (Probably I missed something, but looking through the source code it
> > >    seems that query results from all queriable resource providers
> > >    supporting the given query language will be combined regardless of
> > >    where in the resource tree the corresponding provider is configured.)
> > >
> >
> >
> > You may decide to limit searches to path subtrees in the query language
> > itself.
> >
> >
> >
> > >
> > >
> > >
> > > Please write any feedback/thoughts you have after reading this vision –
> > > this’ll really help me to understand details further.
> > >
> > >
> > >
> >
> > Sounds like you're getting there; please remember to submit a proposal
> > before the deadline if you're still interested.
> >
> > Thanks
> > Ian
> >
> >
> >
> > >
> > > Many thanks in advance,
> > >
> > > Ilya
> > >
> >
>
