[jira] [Commented] (TIKA-1562) Add examples from the Tika in Action book
[ https://issues.apache.org/jira/browse/TIKA-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338988#comment-14338988 ]

Vineet Ghatge commented on TIKA-1562:
-------------------------------------

May I also suggest adding a chapter of case studies, so that people know in what kinds of situations they could use Tika?

> Add examples from the Tika in Action book
> -----------------------------------------
>
>                 Key: TIKA-1562
>                 URL: https://issues.apache.org/jira/browse/TIKA-1562
>             Project: Tika
>          Issue Type: Bug
>          Components: example
>            Reporter: Chris A. Mattmann
>            Assignee: Chris A. Mattmann
>             Fix For: 1.8
>
> Manning Publications has granted permission for me to contribute the Tika in
> Action code to Apache Tika. Yay! I'll put it in the examples module and
> update it if needed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297709#comment-14297709 ]

Vineet Ghatge commented on TIKA-1423:
-------------------------------------

[~lewismc] I just tested it on my end as well. It works well. Thanks everyone for pushing this through!

> Build a parser to extract data from GRIB formats
> ------------------------------------------------
>
>                 Key: TIKA-1423
>                 URL: https://issues.apache.org/jira/browse/TIKA-1423
>             Project: Tika
>          Issue Type: New Feature
>          Components: metadata, mime, parser
>    Affects Versions: 1.6
>            Reporter: Vineet Ghatge
>            Assignee: Vineet Ghatge
>            Priority: Critical
>              Labels: features, newbie
>             Fix For: 1.8
>
>         Attachments: GRIBParsertest.java, GribParser.java,
> NLDAS_FORA0125_H.A20130112.1200.002.grb, TIKA-1423.palsulich.120614.patch,
> TIKA-1423.patch, TIKA-1423v2.patch, fileName.html,
> gdas1.forecmwf.2014062612.grib2
>
> The Arctic dataset contains files in a format called GRIB (General
> Regularly-distributed Information in Binary form,
> http://en.wikipedia.org/wiki/GRIB ). GRIB is a well-known, concise data
> format used in meteorology to store historical and forecast weather data.
> Two editions of the format are in common use, GRIB 1 and GRIB 2 (edition 0
> is obsolete). The focus will be on GRIB 2, which is the most prevalent.
> Each GRIB record intended for either transmission or storage contains a
> single parameter with values located at an array of grid points, or
> represented as a set of spectral coefficients, for a single level (or
> layer), encoded as a continuous bit stream. Logical divisions of the record
> are designated as "sections", each of which provides control information
> and/or data. A GRIB 1 record consists of six sections, two of which are
> optional:
>
> (0) Indicator Section
> (1) Product Definition Section (PDS)
> (2) Grid Description Section (GDS), optional
> (3) Bit Map Section (BMS), optional
> (4) Binary Data Section (BDS)
> (5) '7777' (ASCII characters)
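
The record layout described above can be sketched roughly in Java. This is only an illustration, not the attached GribParser.java and not a Tika or netCDF API: the class and method names (GribEnvelope, edition, hasEndMarker) are hypothetical. It assumes the standard envelope, in which a message begins with the ASCII bytes "GRIB", carries its edition number in octet 8, and ends with the ASCII bytes "7777". The sample bytes in main are synthetic and far shorter than a real message.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Tika or netCDF-Java.
class GribEnvelope {
    private static final byte[] MAGIC = "GRIB".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] END = "7777".getBytes(StandardCharsets.US_ASCII);

    /** Returns the edition number from octet 8, or -1 if the bytes do not start with "GRIB". */
    static int edition(byte[] message) {
        if (message == null || message.length < 8) {
            return -1;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (message[i] != MAGIC[i]) {
                return -1;
            }
        }
        // Octet 8 (index 7) holds the edition number; mask to read it as unsigned.
        return message[7] & 0xFF;
    }

    /** Returns true if the message ends with the ASCII end marker "7777". */
    static boolean hasEndMarker(byte[] message) {
        if (message == null || message.length < END.length) {
            return false;
        }
        int start = message.length - END.length;
        for (int i = 0; i < END.length; i++) {
            if (message[start + i] != END[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Synthetic bytes: indicator prefix with edition 2, followed directly by the end marker.
        byte[] sample = {'G', 'R', 'I', 'B', 0, 0, 0, 2, '7', '7', '7', '7'};
        System.out.println("edition=" + edition(sample) + ", endMarker=" + hasEndMarker(sample));
    }
}
```

A check like this only confirms the envelope; decoding the sections in between is the job of a full GRIB library.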
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241607#comment-14241607 ]

Vineet Ghatge commented on TIKA-1423:
-------------------------------------

[~tpalsulich]: Agreed. I have made a little more progress and gotten past the errors that I last reported. I have pushed the changes to git. Now I am getting the error below. The stack trace mentions osgi.wiring.package=thredds.catalog; I am not sure what that is.

ERROR: Bundle org.apache.tika.bundle [18] Error starting file:/home/afox/tika/tika/tika-bundle/target/test-bundles/tika-bundle.jar (org.osgi.framework.BundleException: Unresolved constraint in bundle org.apache.tika.bundle [18]: Unable to resolve 18.0: missing requirement [18.0] osgi.wiring.package; (osgi.wiring.package=thredds.catalog))
org.osgi.framework.BundleException: Unresolved constraint in bundle org.apache.tika.bundle [18]: Unable to resolve 18.0: missing requirement [18.0] osgi.wiring.package; (osgi.wiring.package=thredds.catalog)
    at org.apache.felix.framework.Felix.resolveBundleRevision(Felix.java:3980)
    at org.apache.felix.framework.Felix.startBundle(Felix.java:2043)
    at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1297)
    at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
    at java.lang.Thread.run(Thread.java:745)
544 [main] ERROR org.ops4j.pax.exam.nat.internal.NativeTestContainer - Bundle [org.apache.tika.bundle [18]] is not resolved
878 [main] INFO org.ops4j.pax.exam.spi.reactors.ReactorManager - suite finished
Tests run: 6, Failures: 1, Errors: 0, Skipped: 3, Time elapsed: 0.945 sec <<< FAILURE!
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238572#comment-14238572 ]

Vineet Ghatge commented on TIKA-1423:
-------------------------------------

I was able to get past the above errors with the help of hints in TIKA-1276, but I am unable to get past the following error:

ERROR: Bundle org.apache.tika.bundle [18] Error starting file:/home/afox/tika/tika/tika-bundle/target/test-bundles/tika-bundle.jar (org.osgi.framework.BundleException: Unresolved constraint in bundle org.apache.tika.bundle [18]: Unable to resolve 18.0: missing requirement [18.0] osgi.wiring.package; (osgi.wiring.package=ucar.ma2))
org.osgi.framework.BundleException: Unresolved constraint in bundle org.apache.tika.bundle [18]: Unable to resolve 18.0: missing requirement [18.0] osgi.wiring.package; (osgi.wiring.package=ucar.ma2)
    at org.apache.felix.framework.Felix.resolveBundleRevision(Felix.java:3980)
    at org.apache.felix.framework.Felix.startBundle(Felix.java:2043)
    at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1297)
    at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
    at java.lang.Thread.run(Thread.java:745)
549 [main] ERROR org.ops4j.pax.exam.nat.internal.NativeTestContainer - Bundle [org.apache.tika.bundle [18]] is not resolved

I have pushed the changes so far to https://github.com/hemantku/tika/tree/TIKA-1423

What does the above stack trace indicate? My guess is that we need to mention ucar.ma2 in the pom.xml, but that does not seem to help. Any thoughts? [~chrismattmann] [~lewismc] [~annieburgess] [~tpalsulich]
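
For context, a "missing requirement ... osgi.wiring.package=X" message from the OSGi resolver means the bundle imports package X but no installed bundle exports it. Since different packages are reported at different points in this thread (ucar.ma2 here, thredds.catalog in another comment), it may help to collect the names from the test log in one pass. The following is a minimal Java sketch under that assumption; the class name MissingPackages is hypothetical, and the regular expression simply assumes the Felix message format quoted above.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log helper for illustration; not part of Tika, Felix, or Pax Exam.
class MissingPackages {
    // Matches the filter fragment "(osgi.wiring.package=some.package.name)".
    private static final Pattern REQUIREMENT =
            Pattern.compile("\\(osgi\\.wiring\\.package=([A-Za-z0-9_.]+)\\)");

    /** Returns the distinct package names reported as unresolved, in order of first appearance. */
    static Set<String> find(String log) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = REQUIREMENT.matcher(log);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String line = "Unable to resolve 18.0: missing requirement [18.0] "
                + "osgi.wiring.package; (osgi.wiring.package=ucar.ma2)";
        System.out.println(find(line));
    }
}
```

Each name it reports is a package the bundle imports that nothing in the test container exports, so each one needs a provider or a change to the bundle's imports.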
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236772#comment-14236772 ]

Vineet Ghatge commented on TIKA-1423:
-------------------------------------

[~tpalsulich]: It seems similar to the issue at https://issues.apache.org/jira/browse/TIKA-1276. I applied the patch TIKA-1276_20140428_3_rwesten.diff, but I am getting a new error:

Tests run: 6, Failures: 0, Errors: 6, Skipped: 0, Time elapsed: 0.169 sec <<< FAILURE!
initializationError(org.apache.tika.bundle.BundleIT) Time elapsed: 0.005 sec <<< ERROR!
java.lang.Exception: Method testBundleSimpleText should have no parameters
    at org.junit.runners.model.FrameworkMethod.validatePublicVoidNoArg(FrameworkMethod.java:72)
    at org.junit.runners.ParentRunner.validatePublicVoidNoArgMethods(ParentRunner.java:133)
    at org.junit.runners.BlockJUnit4ClassRunner.validateTestMethods(BlockJUnit4ClassRunner.java:186)
    at org.junit.runners.BlockJUnit4ClassRunner.validateInstanceMethods(BlockJUnit4ClassRunner.java:166)
    at org.junit.runners.BlockJUnit4ClassRunner.collectInitializationErrors(BlockJUnit4ClassRunner.java:104)
    at org.junit.runners.ParentRunner.validate(ParentRunner.java:355)
    at org.junit.runners.ParentRunner.<init>(ParentRunner.java:76)
    at org.junit.runners.BlockJUnit4ClassRunner.<init>(BlockJUnit4ClassRunner.java:57)
    at org.ops4j.pax.exam.junit.impl.ProbeRunner.<init>(ProbeRunner.java:74)
    at org.ops4j.pax.exam.junit.PaxExam.createDelegate(PaxExam.java:82)
    at org.ops4j.pax.exam.junit.PaxExam.<init>(PaxExam.java:73)
    at org.ops4j.pax.exam.junit.JUnit4TestRunner.<init>(JUnit4TestRunner.java:30)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.junit.internal.builders.AnnotatedBuilder.buildRunner(AnnotatedBuilder.java:29)
    at org.junit.internal.builders.AnnotatedBuilder.runnerForClass(AnnotatedBuilder.java:21)
    at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
    at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:26)
    at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
    at org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:26)
    at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:51)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
    at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
    at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
    at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236690#comment-14236690 ]

Vineet Ghatge commented on TIKA-1423:
-------------------------------------

[~chrismattmann]: Please find the patch attached.
[jira] [Updated] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Ghatge updated TIKA-1423:
-------------------------------
    Attachment: TIKA-1423.patch
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236206#comment-14236206 ]

Vineet Ghatge commented on TIKA-1423:
-------------------------------------

[~tpalsulich]: What error do you get when the tests fail? Is it the same "not a valid CDM file" error?
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236124#comment-14236124 ]

Vineet Ghatge commented on TIKA-1423:
-------------------------------------

[~tpalsulich]: Should it be this one: https://artifacts.unidata.ucar.edu/content/repositories/unidata-releases/edu/ucar/grib/4.5.3/ ? I am not sure that we require it; the parser should need only the netCDF JARs.
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236106#comment-14236106 ]

Vineet Ghatge commented on TIKA-1423:
-------------------------------------

[~tpalsulich]: Thanks for helping me out! I think it will be better to use the following:

- The THREDDS parent POM is available: http://search.maven.org/#artifactdetails|edu.ucar|thredds-parent|4.5.3|pom
- netCDF 4.5.3 is available: http://search.maven.org/#artifactdetails|edu.ucar|netcdf4|4.5.3|jar

I will push the changes from my repo so that they are in sync with the review board.
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191333#comment-14191333 ]

Vineet Ghatge commented on TIKA-1423:
-------------------------------------

OK. I submitted the patch as Review Request #2741. Please let me know if you have any questions.
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191249#comment-14191249 ]

Vineet Ghatge commented on TIKA-1423:
-------------------------------------

Hi [~tpalsulich], I tried that, but the build fails even after deleting it from there. Can I push the JAR file into the repo, or pass it manually on the classpath?
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191228#comment-14191228 ]

Vineet Ghatge commented on TIKA-1423:
-------------------------------------

This was working well last week.
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191225#comment-14191225 ]

Vineet Ghatge commented on TIKA-1423:
-------------------------------------

Hi [~chrismattmann] [~lewismc],

I am not sure if something changed in Maven or Tika, but my Tika build keeps failing with the following error:

Failed to read artifact descriptor for edu.ucar:netcdf:jar:4.3.22: Failure to find edu.ucar:thredds-parent:pom:4.3.22 in http://repo.maven.apache.org/maven2 was cached in the local repository, resolution will not be reattempted until the update interval of central has elapsed or updates are forced -> [Help 1]

The repository (http://search.maven.org/#search%7Cga%7C1%7Cnetcdf) has 4.3.22, which has been there since August 2014.

I have added the following tags in tika-parsers/pom.xml:

    <dependency>
      <groupId>edu.ucar</groupId>
      <artifactId>netcdf</artifactId>
      <version>4.3.22</version>
    </dependency>

Am I missing something else? Any pointers will be helpful.
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184312#comment-14184312 ] Vineet Ghatge commented on TIKA-1423: - [~lewismc]: I tried mvn dependency:analyze-report and it did not show any conflicts:
// -- // Transitive dependencies of this project determined from the // maven pom organized by organization. // -- Apache Tika
I tried excluding the dependencies, but even that did not resolve it. I finished the test case, which I have attached here. I am trying to file a patch on reviews.apache.org, but I am not sure about the workflow. Is there a document that says what I need to submit?
> Build a parser to extract data from GRIB formats > > > Key: TIKA-1423 > URL: https://issues.apache.org/jira/browse/TIKA-1423 > Project: Tika > Issue Type: New Feature > Components: metadata, mime, parser >Affects Versions: 1.6 >Reporter: Vineet Ghatge >Assignee: Vineet Ghatge >Priority: Critical > Labels: features, newbie > Fix For: 1.8 > > Attachments: GRIBParsertest.java, GribParser.java, > NLDAS_FORA0125_H.A20130112.1200.002.grb, fileName.html, > gdas1.forecmwf.2014062612.grib2 > > > Arctic dataset contains a MIME format called GRIB - General > Regularlydistributed information in Binary form > http://en.wikipedia.org/wiki/GRIB . GRIB is a well known data format which is > a concise data format used in meteorology to store historical and > weather data. There are 2 different types of the format GRIB 0, GRIB 2. > The focus will be on GRIB 2 which is the most prevalent. Each GRIB record > intended for either transmission or storage contains a single parameter with > values located at an array of grid points, or represented as a set of > spectral coefficients, for a single level (or layer), encoded as a continuous > bit stream. Logical divisions of the record are designated as "sections", > each of which provides control information and/or data.
A GRIB record > consists of six sections, two of which are optional: > > (0) Indicator Section > (1) Product Definition Section (PDS) > (2) Grid Description Section (GDS) optional > (3) Bit Map Section (BMS) optional > (4) Binary Data Section (BDS) > (5) '' (ASCII Characters) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Ghatge updated TIKA-1423: Attachment: GRIBParsertest.java > Build a parser to extract data from GRIB formats > > > Key: TIKA-1423 > URL: https://issues.apache.org/jira/browse/TIKA-1423 > Project: Tika > Issue Type: New Feature > Components: metadata, mime, parser >Affects Versions: 1.6 >Reporter: Vineet Ghatge >Assignee: Vineet Ghatge >Priority: Critical > Labels: features, newbie > Fix For: 1.8 > > Attachments: GRIBParsertest.java, GribParser.java, > NLDAS_FORA0125_H.A20130112.1200.002.grb, fileName.html, > gdas1.forecmwf.2014062612.grib2 > > > Arctic dataset contains a MIME format called GRIB - General > Regularlydistributed information in Binary form > http://en.wikipedia.org/wiki/GRIB . GRIB is a well known data format which is > a concise data format used in meteorology to store historical and > weather data. There are 2 different types of the format GRIB 0, GRIB 2. > The focus will be on GRIB 2 which is the most prevalent. Each GRIB record > intended for either transmission or storage contains a single parameter with > values located at an array of grid points, or represented as a set of > spectral coefficients, for a single level (or layer), encoded as a continuous > bit stream. Logical divisions of the record are designated as "sections", > each of which provides control information and/or data. A GRIB record > consists of six sections, two of which are optional: > > (0) Indicator Section > (1) Product Definition Section (PDS) > (2) Grid Description Section (GDS) optional > (3) Bit Map Section (BMS) optional > (4) Binary Data Section (BDS) > (5) '' (ASCII Characters) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Ghatge updated TIKA-1423: Attachment: fileName.html Output in HTML > Build a parser to extract data from GRIB formats > > > Key: TIKA-1423 > URL: https://issues.apache.org/jira/browse/TIKA-1423 > Project: Tika > Issue Type: New Feature > Components: metadata, mime, parser >Affects Versions: 1.6 >Reporter: Vineet Ghatge >Assignee: Vineet Ghatge >Priority: Critical > Labels: features, newbie > Fix For: 1.7 > > Attachments: GribParser.java, > NLDAS_FORA0125_H.A20130112.1200.002.grb, fileName.html, > gdas1.forecmwf.2014062612.grib2 > > > Arctic dataset contains a MIME format called GRIB - General > Regularlydistributed information in Binary form > http://en.wikipedia.org/wiki/GRIB . GRIB is a well known data format which is > a concise data format used in meteorology to store historical and > weather data. There are 2 different types of the format GRIB 0, GRIB 2. > The focus will be on GRIB 2 which is the most prevalent. Each GRIB record > intended for either transmission or storage contains a single parameter with > values located at an array of grid points, or represented as a set of > spectral coefficients, for a single level (or layer), encoded as a continuous > bit stream. Logical divisions of the record are designated as "sections", > each of which provides control information and/or data. A GRIB record > consists of six sections, two of which are optional: > > (0) Indicator Section > (1) Product Definition Section (PDS) > (2) Grid Description Section (GDS) optional > (3) Bit Map Section (BMS) optional > (4) Binary Data Section (BDS) > (5) '' (ASCII Characters) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182194#comment-14182194 ] Vineet Ghatge commented on TIKA-1423: - I used the parser to get data in HTML format and it works. I have attached the output to this issue. There is a problem with the netcdfAll-4.5 jar: it keeps displaying these warnings:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/vineghlinux/Desktop/CoursesFall2014/CSCI572/DR/netcdfAll-4.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/vineghlinux/Desktop/CoursesFall2014/CSCI572/DR/tika-app-1.7-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/vineghlinux/Desktop/CoursesFall2014/CSCI572/DR/slf4j-simple-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]
I tried to change the pom.xml of Tika, but that did not work either. I am trying to remedy this based on http://www.slf4j.org/codes.html#multiple_bindings and http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/reference/JarDependencies.html
> Build a parser to extract data from GRIB formats > > > Key: TIKA-1423 > URL: https://issues.apache.org/jira/browse/TIKA-1423 > Project: Tika > Issue Type: New Feature > Components: metadata, mime, parser >Affects Versions: 1.6 >Reporter: Vineet Ghatge >Assignee: Vineet Ghatge >Priority: Critical > Labels: features, newbie > Fix For: 1.7 > > Attachments: GribParser.java, > NLDAS_FORA0125_H.A20130112.1200.002.grb, gdas1.forecmwf.2014062612.grib2 > > > Arctic dataset contains a MIME format called GRIB - General > Regularlydistributed information in Binary form > http://en.wikipedia.org/wiki/GRIB .
GRIB is a well known data format which is > a concise data format used in meteorology to store historical and > weather data. There are 2 different types of the format GRIB 0, GRIB 2. > The focus will be on GRIB 2 which is the most prevalent. Each GRIB record > intended for either transmission or storage contains a single parameter with > values located at an array of grid points, or represented as a set of > spectral coefficients, for a single level (or layer), encoded as a continuous > bit stream. Logical divisions of the record are designated as "sections", > each of which provides control information and/or data. A GRIB record > consists of six sections, two of which are optional: > > (0) Indicator Section > (1) Product Definition Section (PDS) > (2) Grid Description Section (GDS) optional > (3) Bit Map Section (BMS) optional > (4) Binary Data Section (BDS) > (5) '' (ASCII Characters) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
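[Editor's sketch] The SLF4J warning in the comment above is raised when more than one jar on the class path supplies org/slf4j/impl/StaticLoggerBinder.class. A minimal way to see which jars are involved, before deciding which one to exclude, is to ask the class loader for every copy of that resource. The class name BindingFinder and the method find are hypothetical and are not part of Tika, netCDF-Java, or SLF4J; only standard JDK calls are used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists every class-path location that provides a resource.
class BindingFinder {
    // Resource that each SLF4J 1.x binding jar contains (taken from the warning text).
    static final String BINDER = "org/slf4j/impl/StaticLoggerBinder.class";

    static List<URL> find(String resourcePath) {
        ClassLoader loader = BindingFinder.class.getClassLoader();
        if (loader == null) {
            // Loaded by the bootstrap loader; fall back to the system class loader.
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            // getResources returns one URL per class-path entry that holds the resource.
            return Collections.list(loader.getResources(resourcePath));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<URL> hits = find(BINDER);
        System.out.println(hits.size() + " binding(s) found");
        for (URL url : hits) {
            System.out.println("  " + url);
        }
    }
}
```

Running it with the same class path as the failing program should print one line per jar named in the warning; more than one line means a binding still has to be removed or excluded.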
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180721#comment-14180721 ] Vineet Ghatge commented on TIKA-1423: - Sure, I will include the test case. Thanks! > Build a parser to extract data from GRIB formats > > > Key: TIKA-1423 > URL: https://issues.apache.org/jira/browse/TIKA-1423 > Project: Tika > Issue Type: New Feature > Components: metadata, mime, parser >Affects Versions: 1.6 >Reporter: Vineet Ghatge >Assignee: Vineet Ghatge >Priority: Critical > Labels: features, newbie > Fix For: 1.7 > > Attachments: GribParser.java, > NLDAS_FORA0125_H.A20130112.1200.002.grb, gdas1.forecmwf.2014062612.grib2 > > > Arctic dataset contains a MIME format called GRIB - General > Regularlydistributed information in Binary form > http://en.wikipedia.org/wiki/GRIB . GRIB is a well known data format which is > a concise data format used in meteorology to store historical and > weather data. There are 2 different types of the format GRIB 0, GRIB 2. > The focus will be on GRIB 2 which is the most prevalent. Each GRIB record > intended for either transmission or storage contains a single parameter with > values located at an array of grid points, or represented as a set of > spectral coefficients, for a single level (or layer), encoded as a continuous > bit stream. Logical divisions of the record are designated as "sections", > each of which provides control information and/or data. A GRIB record > consists of six sections, two of which are optional: > > (0) Indicator Section > (1) Product Definition Section (PDS) > (2) Grid Description Section (GDS) optional > (3) Bit Map Section (BMS) optional > (4) Binary Data Section (BDS) > (5) '' (ASCII Characters) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178052#comment-14178052 ] Vineet Ghatge commented on TIKA-1423: - Hey [~lewismc] I am working on it, I will post my updates this week. You can assign this to me. > Build a parser to extract data from GRIB formats > > > Key: TIKA-1423 > URL: https://issues.apache.org/jira/browse/TIKA-1423 > Project: Tika > Issue Type: New Feature > Components: metadata, mime, parser >Affects Versions: 1.6 >Reporter: Vineet Ghatge >Assignee: Lewis John McGibbney >Priority: Critical > Labels: features, newbie > Fix For: 1.7 > > Attachments: GribParser.java, > NLDAS_FORA0125_H.A20130112.1200.002.grb, gdas1.forecmwf.2014062612.grib2 > > > Arctic dataset contains a MIME format called GRIB - General > Regularlydistributed information in Binary form > http://en.wikipedia.org/wiki/GRIB . GRIB is a well known data format which is > a concise data format used in meteorology to store historical and > weather data. There are 2 different types of the format GRIB 0, GRIB 2. > The focus will be on GRIB 2 which is the most prevalent. Each GRIB record > intended for either transmission or storage contains a single parameter with > values located at an array of grid points, or represented as a set of > spectral coefficients, for a single level (or layer), encoded as a continuous > bit stream. Logical divisions of the record are designated as "sections", > each of which provides control information and/or data. A GRIB record > consists of six sections, two of which are optional: > > (0) Indicator Section > (1) Product Definition Section (PDS) > (2) Grid Description Section (GDS) optional > (3) Bit Map Section (BMS) optional > (4) Binary Data Section (BDS) > (5) '' (ASCII Characters) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156262#comment-14156262 ] Vineet Ghatge commented on TIKA-1423: - UPDATE: I picked this up from a conversation between Annie and Christian Ward from Netcdf - http://www.unidata.ucar.edu/mailing_lists/archives/netcdf-java/2014/msg00091.html . A sample was provided there, which I ran, and it prints out the GRIB2 data:

    import java.io.File;
    import java.io.IOException;

    import ucar.nc2.NetcdfFile;
    import ucar.nc2.dataset.NetcdfDataset;

    public class Foo {
        public static void main(String[] args) throws IOException {
            File gribFile = new File("gdas1.forecmwf.2014062612.grib2");
            // Open the GRIB2 file through the netCDF-Java dataset layer.
            NetcdfFile ncFile = NetcdfDataset.openFile(gribFile.getAbsolutePath(), null);
            System.out.println("Success!");
            try {
                // Dump the structure (dimensions, variables, attributes) of the file.
                System.out.println(ncFile.toString());
            } finally {
                ncFile.close();
            }
        }
    }

This parses and loads the GRIB2 format. I am currently working on getting Annie's code and changing the class path references.
> Build a parser to extract data from GRIB formats > > > Key: TIKA-1423 > URL: https://issues.apache.org/jira/browse/TIKA-1423 > Project: Tika > Issue Type: New Feature > Components: metadata, mime, parser >Affects Versions: 1.6 >Reporter: Vineet Ghatge >Priority: Critical > Labels: features, newbie > Fix For: 1.7 > > Attachments: GribParser.java, > NLDAS_FORA0125_H.A20130112.1200.002.grb, gdas1.forecmwf.2014062612.grib2 > > > Arctic dataset contains a MIME format called GRIB - General > Regularlydistributed information in Binary form > http://en.wikipedia.org/wiki/GRIB . GRIB is a well known data format which is > a concise data format used in meteorology to store historical and > weather data. There are 2 different types of the format GRIB 0, GRIB 2. > The focus will be on GRIB 2 which is the most prevalent.
Each GRIB record > intended for either transmission or storage contains a single parameter with > values located at an array of grid points, or represented as a set of > spectral coefficients, for a single level (or layer), encoded as a continuous > bit stream. Logical divisions of the record are designated as "sections", > each of which provides control information and/or data. A GRIB record > consists of six sections, two of which are optional: > > (0) Indicator Section > (1) Product Definition Section (PDS) > (2) Grid Description Section (GDS) optional > (3) Bit Map Section (BMS) optional > (4) Binary Data Section (BDS) > (5) '' (ASCII Characters) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156146#comment-14156146 ] Vineet Ghatge commented on TIKA-1423: - Thanks [~lewismc] > Build a parser to extract data from GRIB formats > > > Key: TIKA-1423 > URL: https://issues.apache.org/jira/browse/TIKA-1423 > Project: Tika > Issue Type: New Feature > Components: metadata, mime, parser >Affects Versions: 1.6 >Reporter: Vineet Ghatge >Priority: Critical > Labels: features, newbie > Fix For: 1.7 > > Attachments: GribParser.java, > NLDAS_FORA0125_H.A20130112.1200.002.grb, gdas1.forecmwf.2014062612.grib2 > > > Arctic dataset contains a MIME format called GRIB - General > Regularlydistributed information in Binary form > http://en.wikipedia.org/wiki/GRIB . GRIB is a well known data format which is > a concise data format used in meteorology to store historical and > weather data. There are 2 different types of the format GRIB 0, GRIB 2. > The focus will be on GRIB 2 which is the most prevalent. Each GRIB record > intended for either transmission or storage contains a single parameter with > values located at an array of grid points, or represented as a set of > spectral coefficients, for a single level (or layer), encoded as a continuous > bit stream. Logical divisions of the record are designated as "sections", > each of which provides control information and/or data. A GRIB record > consists of six sections, two of which are optional: > > (0) Indicator Section > (1) Product Definition Section (PDS) > (2) Grid Description Section (GDS) optional > (3) Bit Map Section (BMS) optional > (4) Binary Data Section (BDS) > (5) '' (ASCII Characters) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154360#comment-14154360 ] Vineet Ghatge commented on TIKA-1423: - [~annieburgess] Do you have another sample GRIB2 file? The current file fails with the following error:
Caused by: java.io.IOException: Cant read /home/vineghlinux/workspace/DR/gdas1.forecmwf.2014062612.grib2: not a valid CDM file.
    at ucar.nc2.NetcdfFile.open(NetcdfFile.java:734)
    at ucar.nc2.NetcdfFile.open(NetcdfFile.java:384)
    at ucar.nc2.dataset.NetcdfDataset.openOrAcquireFile(NetcdfDataset.java:687)
    at ucar.nc2.dataset.NetcdfDataset.openFile(NetcdfDataset.java:564)
    at grib.grib.parse(grib.java:69)
    ... 1 more
> Build a parser to extract data from GRIB formats > > > Key: TIKA-1423 > URL: https://issues.apache.org/jira/browse/TIKA-1423 > Project: Tika > Issue Type: New Feature > Components: metadata, mime, parser >Affects Versions: 1.6 >Reporter: Vineet Ghatge >Priority: Critical > Labels: features, newbie > Fix For: 1.7 > > Attachments: GribParser.java, gdas1.forecmwf.2014062612.grib2 > > > Arctic dataset contains a MIME format called GRIB - General > Regularlydistributed information in Binary form > http://en.wikipedia.org/wiki/GRIB . GRIB is a well known data format which is > a concise data format used in meteorology to store historical and > weather data. There are 2 different types of the format GRIB 0, GRIB 2. > The focus will be on GRIB 2 which is the most prevalent. Each GRIB record > intended for either transmission or storage contains a single parameter with > values located at an array of grid points, or represented as a set of > spectral coefficients, for a single level (or layer), encoded as a continuous > bit stream. Logical divisions of the record are designated as "sections", > each of which provides control information and/or data.
A GRIB record > consists of six sections, two of which are optional: > > (0) Indicator Section > (1) Product Definition Section (PDS) > (2) Grid Description Section (GDS) optional > (3) Bit Map Section (BMS) optional > (4) Binary Data Section (BDS) > (5) '' (ASCII Characters) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
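[Editor's sketch] When a library rejects a file as "not a valid CDM file", one cheap check that does not depend on netCDF-Java is to look at the first eight bytes of the file. The sketch assumes the WMO layout of the Indicator Section, in which octets 1-4 hold the ASCII text "GRIB" and octet 8 holds the edition number; that layout comes from the GRIB specification, not from this thread. GribProbe and edition are hypothetical names. A passing check only rules out a damaged or mislabelled file; it says nothing about other causes, such as a missing GRIB module on the class path.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: inspects the start of a file for the GRIB indicator section.
class GribProbe {
    /** Returns the GRIB edition stored in octet 8, or -1 if the "GRIB" magic is absent. */
    static int edition(byte[] head) {
        if (head == null || head.length < 8) {
            return -1;
        }
        String magic = new String(head, 0, 4, StandardCharsets.US_ASCII);
        if (!magic.equals("GRIB")) {
            return -1;
        }
        return head[7] & 0xFF; // octet 8 is index 7; mask to read it as unsigned
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GribProbe <file>");
            return;
        }
        byte[] head = new byte[8];
        try (DataInputStream in = new DataInputStream(Files.newInputStream(Paths.get(args[0])))) {
            in.readFully(head); // throws EOFException if the file is shorter than 8 bytes
        }
        int ed = edition(head);
        System.out.println(ed < 0 ? "no GRIB magic at offset 0" : "GRIB edition " + ed);
    }
}
```

Some GRIB files carry extra bytes before the first record, so a negative result at offset 0 is a hint rather than proof that the file is unusable.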
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148398#comment-14148398 ] Vineet Ghatge commented on TIKA-1423: - I am pulling down the data and the JAR file and trying to set up an environment for the failing scenario. > Build a parser to extract data from GRIB formats > > > Key: TIKA-1423 > URL: https://issues.apache.org/jira/browse/TIKA-1423 > Project: Tika > Issue Type: New Feature > Components: metadata, mime, parser >Affects Versions: 1.6 >Reporter: Vineet Ghatge >Priority: Critical > Labels: features, newbie > Fix For: 1.7 > > Attachments: GribParser.java, gdas1.forecmwf.2014062612.grib2 > > > Arctic dataset contains a MIME format called GRIB - General > Regularlydistributed information in Binary form > http://en.wikipedia.org/wiki/GRIB . GRIB is a well known data format which is > a concise data format used in meteorology to store historical and > weather data. There are 2 different types of the format GRIB 0, GRIB 2. > The focus will be on GRIB 2 which is the most prevalent. Each GRIB record > intended for either transmission or storage contains a single parameter with > values located at an array of grid points, or represented as a set of > spectral coefficients, for a single level (or layer), encoded as a continuous > bit stream. Logical divisions of the record are designated as "sections", > each of which provides control information and/or data. A GRIB record > consists of six sections, two of which are optional: > > (0) Indicator Section > (1) Product Definition Section (PDS) > (2) Grid Description Section (GDS) optional > (3) Bit Map Section (BMS) optional > (4) Binary Data Section (BDS) > (5) '' (ASCII Characters) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144013#comment-14144013 ] Vineet Ghatge commented on TIKA-1423: - [~lewismc] I have to take a look at the parser given by [~annieburgess]. Thanks for sharing what you have developed. I plan to build on the wrapper described at http://www.unidata.ucar.edu/blogs/news/entry/netcdf_java_library_and_tds7. There is a recent 4.3 version which is supposedly capable of handling both GRIB 1 and GRIB 2. I need to explore this, and I am guessing the sample parser works with it. I will post my updates as I progress. Thanks > Build a parser to extract data from GRIB formats > > > Key: TIKA-1423 > URL: https://issues.apache.org/jira/browse/TIKA-1423 > Project: Tika > Issue Type: New Feature > Components: metadata, mime, parser >Affects Versions: 1.6 >Reporter: Vineet Ghatge >Priority: Critical > Labels: features, newbie > Fix For: 1.7 > > Attachments: GribParser.java, gdas1.forecmwf.2014062612.grib2 > > > Arctic dataset contains a MIME format called GRIB - General > Regularlydistributed information in Binary form > http://en.wikipedia.org/wiki/GRIB . GRIB is a well known data format which is > a concise data format used in meteorology to store historical and > weather data. There are 2 different types of the format GRIB 0, GRIB 2. > The focus will be on GRIB 2 which is the most prevalent. Each GRIB record > intended for either transmission or storage contains a single parameter with > values located at an array of grid points, or represented as a set of > spectral coefficients, for a single level (or layer), encoded as a continuous > bit stream. Logical divisions of the record are designated as "sections", > each of which provides control information and/or data.
A GRIB record > consists of six sections, two of which are optional: > > (0) Indicator Section > (1) Product Definition Section (PDS) > (2) Grid Description Section (GDS) optional > (3) Bit Map Section (BMS) optional > (4) Binary Data Section (BDS) > (5) '' (ASCII Characters) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TIKA-1423) Build a parser to extract data from GRIB formats
Vineet Ghatge created TIKA-1423:
---
Summary: Build a parser to extract data from GRIB formats
Key: TIKA-1423
URL: https://issues.apache.org/jira/browse/TIKA-1423
Project: Tika
Issue Type: New Feature
Components: metadata, mime, parser
Affects Versions: 1.6
Reporter: Vineet Ghatge
Priority: Critical
Fix For: 1.7

Arctic dataset contains a MIME format called GRIB - General Regularly-distributed Information in Binary form http://en.wikipedia.org/wiki/GRIB . GRIB is a well-known, concise data format used in meteorology to store historical and forecast weather data. There are 2 different types of the format: GRIB 0 and GRIB 2. The focus will be on GRIB 2, which is the most prevalent. Each GRIB record intended for either transmission or storage contains a single parameter with values located at an array of grid points, or represented as a set of spectral coefficients, for a single level (or layer), encoded as a continuous bit stream. Logical divisions of the record are designated as "sections", each of which provides control information and/or data. A GRIB record consists of six sections, two of which are optional:

(0) Indicator Section
(1) Product Definition Section (PDS)
(2) Grid Description Section (GDS) optional
(3) Bit Map Section (BMS) optional
(4) Binary Data Section (BDS)
(5) '7777' (ASCII Characters)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
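[Editor's sketch] The six-section record layout listed in the issue description can be written down as a small Java enum, which a parser could use to label what it has read and to know which sections may be absent. This only models the list as given in the issue; the names GribSections, Section and optionalCount are hypothetical and do not come from Tika or netCDF-Java.

```java
// Hypothetical model of the six record sections listed in the issue description.
class GribSections {
    enum Section {
        INDICATOR(0, "Indicator Section", false),
        PDS(1, "Product Definition Section (PDS)", false),
        GDS(2, "Grid Description Section (GDS)", true),
        BMS(3, "Bit Map Section (BMS)", true),
        BDS(4, "Binary Data Section (BDS)", false),
        // End marker: the four ASCII characters '7777' defined by the GRIB specification.
        END(5, "'7777' (ASCII Characters)", false);

        final int number;
        final String title;
        final boolean optional;

        Section(int number, String title, boolean optional) {
            this.number = number;
            this.title = title;
            this.optional = optional;
        }
    }

    /** Counts the sections that a record is allowed to omit. */
    static int optionalCount() {
        int count = 0;
        for (Section s : Section.values()) {
            if (s.optional) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (Section s : Section.values()) {
            System.out.println("(" + s.number + ") " + s.title + (s.optional ? " optional" : ""));
        }
    }
}
```

Keeping the optional flag on the enum means a reader loop can skip a missing GDS or BMS without special-casing section numbers elsewhere.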