[jira] [Commented] (TIKA-1843) Tika parser for SEG-Y files and new MIME type application/segy
[ https://issues.apache.org/jira/browse/TIKA-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126272#comment-15126272 ]

Giovanni Usai commented on TIKA-1843:
-------------------------------------

Hi Nick,

The Sigrun owner has merged my modifications, so we can go on with the integration.
Do I have to perform the steps in the guide (http://central.sonatype.org/pages/ossrh-guide.html), or will they be done by you?

Thanks,
Giovanni

> Tika parser for SEG-Y files and new MIME type application/segy
> --------------------------------------------------------------
>
>                 Key: TIKA-1843
>                 URL: https://issues.apache.org/jira/browse/TIKA-1843
>             Project: Tika
>          Issue Type: New Feature
>          Components: mime, parser
>            Reporter: Giovanni Usai
>            Priority: Minor
>
> This ticket refers to the parsing of SEG-Y files (extensions .seg, .segy and .sgy).
> The SEG-Y format is used to store seismic data; you can find more information here: http://pubs.usgs.gov/of/2001/of01-326/HTML/FILEFORM.HTM
> I have:
> - added a new MIME type application/segy matching the file name extensions .segy, .seg and .sgy.
> - created a new SEGYParser, matching that MIME type.
> In order to parse the SEG-Y files, I am using a modified version of the sigrun code (available under the Apache license, here: https://github.com/mikhail-aksenov/sigrun). Notably, I have made a fix and changed some method signatures to be able to read from a ReadableByteChannel instead of a FileChannel.
> For the moment I have put it directly into Tika's new segy package. Is this the right thing to do, or should I reference it as an external library, thus modifying the pom.xml?
> Thanks and best regards,
> Giovanni

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
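For readers following along, the extension matching described in the issue can be pictured with a few lines of plain Java. This is only a sketch: the class SegyNames and the method mimeTypeFor are hypothetical names, the case-insensitive comparison is an assumption, and Tika itself registers file name patterns through its own MIME type configuration rather than through code like this.

```java
import java.util.Locale;

// Hypothetical helper, not part of Tika: a name-based lookup for the
// three SEG-Y extensions listed in the issue description.
final class SegyNames {
    static final String SEGY_MIME = "application/segy";

    private SegyNames() {}

    /** Returns SEGY_MIME when the name ends in .segy, .seg or .sgy, else null. */
    static String mimeTypeFor(String fileName) {
        // Assumption: extensions are compared case-insensitively.
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".segy") || lower.endsWith(".seg") || lower.endsWith(".sgy")) {
            return SEGY_MIME;
        }
        return null;
    }
}
```

A caller would fall back to some other detection step whenever the method returns null.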
[jira] [Commented] (TIKA-1843) Tika parser for SEG-Y files and new MIME type application/segy
[ https://issues.apache.org/jira/browse/TIKA-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126450#comment-15126450 ]

Nick Burch commented on TIKA-1843:
----------------------------------

Ideally you'd work with the Sigrun owner to have them do it - it's best if the people who "own" the code and "do" the releases are also the ones who push the files to Maven Central. (It doesn't have to be that way, there is the third-party process, but it's certainly preferred.)

If I were you, I'd review the docs, then suggest any POM fixes to them. Once those are in, work with the Sigrun team to get them to request their access + get things uploaded.

If you need an example project to crib from for the pom, my own https://github.com/Gagravarr/VorbisJava/blob/master/parent/pom.xml is one place you could start (amongst others!)
[jira] [Commented] (TIKA-1843) Tika parser for SEG-Y files and new MIME type application/segy
[ https://issues.apache.org/jira/browse/TIKA-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123568#comment-15123568 ]

Giovanni Usai commented on TIKA-1843:
-------------------------------------

Hi Nick,

Thanks for the fast reply!

The last sigrun commit (the one from a few days ago) is mine; I had to rename a class to make sigrun compile. Apart from that, there have been no other commits in a year.

Anyway, no problem: I will submit my modifications to sigrun and come back to you once my pull request is merged.

Please note that, as far as I know, the sigrun artifact is not available in any Maven repository yet.

Thanks again!
[jira] [Commented] (TIKA-1843) Tika parser for SEG-Y files and new MIME type application/segy
[ https://issues.apache.org/jira/browse/TIKA-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123552#comment-15123552 ]

Nick Burch commented on TIKA-1843:
----------------------------------

Looks like Sigrun is an active project, so the best bet would be to submit GitHub pull requests to them to add the `ReadableByteChannel` support. Then, once they've added that + released, we'll add a Tika dependency on that + add the parser code.

ASF best practice is to avoid forking upstream projects + bundling modified versions whenever possible, so putting customised versions of Sigrun classes in the Tika segy package should be avoided if possible. Much better to get them to accept the fixes upstream!
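As background on why the `ReadableByteChannel` change matters: a parser that is handed an InputStream can wrap it with `Channels.newChannel`, whereas a `FileChannel` normally requires an actual file. The sketch below shows the idea with a read-exactly-N-bytes helper. The class ChannelReads and its methods are hypothetical and are not taken from Sigrun or Tika.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Hypothetical helper, not Sigrun or Tika code: reads a fixed-size block
// from any ReadableByteChannel, so the source need not be a file.
final class ChannelReads {
    private ChannelReads() {}

    /** Reads exactly length bytes, or throws EOFException if the channel ends first. */
    static byte[] readFully(ReadableByteChannel channel, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        while (buffer.hasRemaining()) {
            // read() may return fewer bytes than requested, so loop until full.
            if (channel.read(buffer) < 0) {
                throw new EOFException("Channel ended after " + buffer.position()
                        + " of " + length + " bytes");
            }
        }
        return buffer.array();
    }

    /** Convenience overload: adapts a stream to a channel first. */
    static byte[] readFully(InputStream stream, int length) throws IOException {
        return readFully(Channels.newChannel(stream), length);
    }
}
```

Since `FileChannel` itself implements `ReadableByteChannel`, code written against the interface should keep working for callers that do have a file.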
[jira] [Commented] (TIKA-1843) Tika parser for SEG-Y files and new MIME type application/segy
[ https://issues.apache.org/jira/browse/TIKA-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123612#comment-15123612 ]

Nick Burch commented on TIKA-1843:
----------------------------------

Getting a maven-built project into the Sonatype OSS repo for maven use isn't too bad. Ideally we'd work with the Sigrun team to get their POM into shape so it can be released as per http://central.sonatype.org/pages/ossrh-guide.html , otherwise we can take over and upload it for them as a third party.

Ask on the dev list for help with any of those if needed, we've several people well experienced in both routes!