[jira] [Commented] (SOLR-10981) Allow update to load gzip files

JIRA Fri, 30 Jun 2017 02:04:34 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-10981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16069747#comment-16069747
 ]


Jan Høydahl commented on SOLR-10981:
------------------------------------

This is a duplicate of SOLR-7925, although it proposes a different solution. 
I have not looked at the patches in detail.

How would this work if you want to post XML, JSON or PDF? Today we rely on the 
{{content-type}} header to select the right update handler. If we want generic 
support for {{Content-Type: application/gzip}}, how would we know which content 
type is inside the uncompressed stream?
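
To make the question concrete, here is a minimal sketch of the ambiguity. It 
is not taken from either patch: the class {{UpdateRequestHeaders}}, the 
{{Plan}} type and the {{plan}} method are hypothetical names used only for 
illustration. It assumes a client could alternatively keep the real payload 
type in {{Content-Type}} and signal compression through the standard HTTP 
{{Content-Encoding: gzip}} header, which is one possible answer and not 
something Solr is confirmed to do here.

```java
import java.util.Locale;

public class UpdateRequestHeaders {

    /** Hypothetical result type: which format to parse, and whether to gunzip first. */
    public static final class Plan {
        public final String payloadType; // null when the inner format cannot be determined
        public final boolean gunzip;

        Plan(String payloadType, boolean gunzip) {
            this.payloadType = payloadType;
            this.gunzip = gunzip;
        }

        @Override
        public String toString() {
            return "payloadType=" + payloadType + ", gunzip=" + gunzip;
        }
    }

    /** Decide how to treat a request body from its two headers (illustrative only). */
    public static Plan plan(String contentType, String contentEncoding) {
        // Drop parameters such as "; charset=utf-8" and normalise case.
        String type = contentType == null
                ? null
                : contentType.split(";")[0].trim().toLowerCase(Locale.ROOT);
        boolean gzipEncoded = contentEncoding != null
                && contentEncoding.trim().equalsIgnoreCase("gzip");

        if ("application/gzip".equals(type)) {
            // The header only says "this is a gzip archive"; the format of the
            // uncompressed stream is unknown, so no handler can be selected from it.
            return new Plan(null, true);
        }
        // The payload type stays visible; compression is signalled separately.
        return new Plan(type, gzipEncoded);
    }

    public static void main(String[] args) {
        System.out.println(plan("text/csv; charset=utf-8", "gzip"));
        System.out.println(plan("application/gzip", null));
    }
}
```

With {{Content-Type: application/gzip}} the sketch has nothing to dispatch on, 
which is exactly the problem raised above.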

> Allow update to load gzip files 
> --------------------------------
>
>                 Key: SOLR-10981
>                 URL: https://issues.apache.org/jira/browse/SOLR-10981
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrJ
>    Affects Versions: 6.6
>            Reporter: Andrew Lundgren
>              Labels: patch
>             Fix For: 4.10.4, 6.6, master (7.0)
>
>         Attachments: SOLR-10981.patch
>
>
> We currently import large CSV files.  We store them as gzip files because 
> they compress at around 80%.
> To import them we must first gunzip them.  After the import we no longer 
> need the decompressed files.
> This patch allows directly opening gzipped URLs or gzipped local files.
> For URLs, the stream is treated as gzipped if the content encoding is 
> "gzip" or if the URL ends in ".gz".
> For files, the file is assumed to be gzipped if its name ends in ".gz".
> I have tested the patch with 4.10.4, 6.6.0 and master from git.
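
The detection rule in the description above can be sketched roughly as 
follows. This is not the code from SOLR-10981.patch: the class 
{{GzipAwareOpener}} and its methods {{looksGzipped}}, {{wrap}} and {{gzip}} 
are hypothetical names, and the sketch works on an in-memory stream instead of 
Solr's own stream classes. It only assumes the rule as stated: a content 
encoding of "gzip" or a name ending in ".gz".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipAwareOpener {

    /**
     * Heuristic from the issue description: treat the source as gzipped if the
     * content encoding is "gzip" or the name ends in ".gz".
     * For local files there is no content encoding, so pass null.
     */
    public static boolean looksGzipped(String name, String contentEncoding) {
        boolean byEncoding = contentEncoding != null
                && contentEncoding.trim().equalsIgnoreCase("gzip");
        boolean byName = name != null
                && name.toLowerCase(Locale.ROOT).endsWith(".gz");
        return byEncoding || byName;
    }

    /** Wrap the raw stream in a GZIPInputStream only when the heuristic matches. */
    public static InputStream wrap(InputStream raw, String name, String contentEncoding)
            throws IOException {
        return looksGzipped(name, contentEncoding) ? new GZIPInputStream(raw) : raw;
    }

    /** Helper for the demo: gzip a small string in memory. */
    public static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("id,name\n1,solr\n");
        // Simulates opening a local file named docs.csv.gz.
        try (InputStream in = wrap(new ByteArrayInputStream(compressed), "docs.csv.gz", null)) {
            System.out.print(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

Note that a suffix check like this identifies the compression only; it says 
nothing about whether the uncompressed content is CSV, JSON or XML.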



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
