Excellent!! I'll create a new RC.

Thanks again,
Karl


On Tue, Sep 25, 2018 at 8:13 AM Julien Massiera <
[email protected]> wrote:

> This new fix seems to work. Ingestions and deletions are working and the
> image file with huge metadata is indexed!
>
> Julien
>
>
> On 25/09/2018 13:59, Karl Wright wrote:
> > I've committed a hack to trunk.  It has been tested for Solr Cell
> > documents, deletions, and for tika-connector-extracted documents that
> > don't have a lot of metadata.  I'm asking Julien to test it with his
> > specific image that has lots of metadata to see if the pathway for that
> > case works properly.  If it does, I'll spin another RC.
> >
> > Long term, since I'm a Lucene/Solr committer, I think I'm going to have
> > to take SolrJ under my wing if we expect it to work for ManifoldCF.  I
> > don't have a lot of time to do stuff like this anymore but clearly
> > neither does the Solr team.
> >
> > Karl
> >
> >
> > On Tue, Sep 25, 2018 at 6:14 AM Karl Wright <[email protected]> wrote:
> >
> >> The back-and-forth is not going well.  Mr. Noble needs to be convinced
> >> that metadata longer than 4096 characters is a valid use case for
> >> Solr.  In fact it seems like the Solr folks have deliberately been
> >> trying to get rid of support for multipart posts for a while, because
> >> they don't see the need for them.  I'm still hoping to convince them
> >> otherwise but I'm not getting a positive feel.
> >>
> >> I'm still trying to figure out if multipart posts have any fundamental
> >> conflict with their RequestWriter architecture.  If not I can perhaps
> >> override the RequestWriter implementation and add multipart support
> >> that way.  But it's not going to be a quick process by any means.
> >>
> >>
> >> On Mon, Sep 24, 2018 at 12:13 PM Karl Wright <[email protected]> wrote:
> >>
> >>> Hi Julien,
> >>>
> >>> This has nothing to do with the new Tika.
> >>>
> >>> It is not normal; it means that UpdateRequests are not being sent as
> >>> multipart form posts.  It's going to require work from the Solr team
> >>> to fix this problem, however, because everything I do to work around
> >>> the issue nonetheless seems to fail. :-(
> >>>
> >>> I'm having a back-and-forth with Paul Noble right now.  I'll update
> >>> accordingly when I know more.
> >>>
> >>> Karl
> >>>
> >>>
> >>> On Mon, Sep 24, 2018 at 11:33 AM Julien Massiera <
> >>> [email protected]> wrote:
> >>>
> >>>> After testing it, it is a +1 for me
> >>>>
> >>>> However, I found an interesting new issue that comes with the new Tika
> >>>> version. I had a jpg file for which some metadata were not extracted
> >>>> before, like RedTRC, BlueTRC and GreenTRC, which contain
> >>>> approximately 2048 bytes of data each. As the metadata are passed to
> >>>> Solr through the URI, I get the following error: URI is too large >8192
> >>>>
> >>>> Do we consider it a "normal issue", or is it worth checking the
> >>>> metadata length before sending the ingest request?
> >>>>
> >>>>
> >>>> On 24/09/2018 16:43, Karl Wright wrote:
> >>>>> Please vote on whether to release ManifoldCF 2.11, RC3.  This release
> >>>>> contains a number of fixes/improvements/additions, described in the
> >>>>> CHANGES.txt file.  In addition, it includes Tika 1.19, which has a
> >>>>> number of fixes for classpath issues specifically requested by
> >>>>> ManifoldCF.
> >>>>>
> >>>>> This completely fixes a SolrJ-related problem with the Solr Connector
> >>>>> found in RC3.  All tests pass.
> >>>>>
> >>>>> The release artifact can be found at:
> >>>>>
> >>>>>
> >>>>> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.11
> >>>>>
> >>>>> There is also a tag at:
> >>>>>
> >>>>> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.11-RC3
> >>>>>
> >>>>> Thanks again,
> >>>>> Karl Wright
> >>>>>
> >>>> --
> >>>> Julien MASSIERA
> >>>> Product Development Director
> >>>> France Labs – The Search Experts
> >>>> Join us at the Enterprise Search & Discovery Summit in Washington, DC
> >>>> www.francelabs.com
> >>>>
> >>>>
>
> --
> Julien MASSIERA
> Product Development Director
> France Labs – The Search Experts
> Join us at the Enterprise Search & Discovery Summit in Washington, DC
> www.francelabs.com
>
>
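
The check Julien asks about in the thread (measuring the metadata length before sending the ingest request) could be sketched roughly as follows. This is a minimal illustration only, not ManifoldCF or SolrJ code: the class UriLengthCheck, its method names, and the base path used in the usage note are hypothetical, and the 8192 limit is simply the number from the error message quoted above (the real limit depends on how the servlet container is configured).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical helper for illustration; not part of ManifoldCF or SolrJ.
class UriLengthCheck {
    // Assumption: taken from the "URI is too large >8192" error in the thread.
    static final int ASSUMED_MAX_URI_LENGTH = 8192;

    // Length of "name=value&name=value..." once each part is URL-encoded.
    static int encodedQueryLength(Map<String, String> params) {
        int length = 0;
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (length > 0) {
                length += 1; // '&' separator between parameters
            }
            // name + '=' + value
            length += encode(entry.getKey()).length() + 1
                    + encode(entry.getValue()).length();
        }
        return length;
    }

    // True if basePath + '?' + encoded parameters stays within the assumed limit.
    static boolean fitsInUri(String basePath, Map<String, String> params) {
        int total = basePath.length();
        if (!params.isEmpty()) {
            total += 1 + encodedQueryLength(params); // '?' plus the query string
        }
        return total <= ASSUMED_MAX_URI_LENGTH;
    }

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

A caller could evaluate UriLengthCheck.fitsInUri("/solr/update/extract", metadata) before building the request and, when it returns false, drop or truncate the oversized fields or switch to sending them in the request body. Note that the encoded length matters, not the raw length: a non-ASCII or binary byte becomes three characters once percent-encoded.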
