[jira] [Commented] (TIKA-1562) Add examples from the Tika in Action book

2015-02-26 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338988#comment-14338988
 ] 

Vineet Ghatge commented on TIKA-1562:
-

Also, may I suggest adding a chapter on case studies, so that people 
know in what kinds of situations they could use Tika?

> Add examples from the Tika in Action book
> -
>
> Key: TIKA-1562
> URL: https://issues.apache.org/jira/browse/TIKA-1562
> Project: Tika
> Issue Type: Bug
> Components: example
> Reporter: Chris A. Mattmann
> Assignee: Chris A. Mattmann
> Fix For: 1.8
>
>
> Manning Publications has granted permission for me to contribute the Tika in 
> Action code to Apache Tika. Yay! I'll put it in the examples module and 
> update it if needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2015-01-29 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297709#comment-14297709
 ] 

Vineet Ghatge commented on TIKA-1423:
-

[~lewismc] I just tested it on my end as well. It works well. Thanks everyone 
for pushing this through!

> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
> Issue Type: New Feature
> Components: metadata, mime, parser
> Affects Versions: 1.6
> Reporter: Vineet Ghatge
> Assignee: Vineet Ghatge
> Priority: Critical
> Labels: features, newbie
> Fix For: 1.8
>
> Attachments: GRIBParsertest.java, GribParser.java, 
> NLDAS_FORA0125_H.A20130112.1200.002.grb, TIKA-1423.palsulich.120614.patch, 
> TIKA-1423.patch, TIKA-1423v2.patch, fileName.html, 
> gdas1.forecmwf.2014062612.grib2
>
>
> The Arctic dataset contains a MIME format called GRIB (General 
> Regularly-distributed Information in Binary form), 
> http://en.wikipedia.org/wiki/GRIB . GRIB is a well-known, concise data format 
> used in meteorology to store historical and forecast weather data. There are 
> 2 different types of the format: GRIB 0 and GRIB 2. 
> The focus will be on GRIB 2, which is the most prevalent. Each GRIB record 
> intended for either transmission or storage contains a single parameter with 
> values located at an array of grid points, or represented as a set of 
> spectral coefficients, for a single level (or layer), encoded as a continuous 
> bit stream. Logical divisions of the record are designated as "sections", 
> each of which provides control information and/or data. A GRIB record 
> consists of six sections, two of which are optional: 
>  
> (0) Indicator Section 
> (1) Product Definition Section (PDS) 
> (2) Grid Description Section (GDS), optional 
> (3) Bit Map Section (BMS), optional 
> (4) Binary Data Section (BDS) 
> (5) '' (ASCII Characters)
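
The Indicator Section listed above is what lets a detector recognise a GRIB record before any real parsing happens. The following is a minimal sketch, not the parser attached to this issue: it assumes that a record starts with the four ASCII characters "GRIB" and that, for GRIB editions 1 and 2, octet 8 of the Indicator Section holds the edition number. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper, not part of Tika: inspects the start of one GRIB record. */
final class GribIndicatorSketch {

    // Assumption: every GRIB record begins with these four ASCII characters.
    private static final byte[] MAGIC = "GRIB".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the buffer holds at least 8 octets and starts with "GRIB". */
    static boolean hasGribMagic(byte[] record) {
        return record.length >= 8
                && Arrays.equals(Arrays.copyOfRange(record, 0, 4), MAGIC);
    }

    /**
     * Returns the unsigned value of octet 8 (index 7), which is the edition
     * number for editions 1 and 2, or -1 if the buffer is not a GRIB record.
     */
    static int edition(byte[] record) {
        return hasGribMagic(record) ? (record[7] & 0xFF) : -1;
    }
}
```

A caller would read the first 8 bytes of a file such as the attached gdas1.forecmwf.2014062612.grib2 and pass them to `GribIndicatorSketch.edition`; a result other than 1 or 2 should be treated as "unknown" rather than as an error.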





[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-12-10 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241607#comment-14241607
 ] 

Vineet Ghatge commented on TIKA-1423:
-

[~tpalsulich]: Agreed. I have made a little more progress and gotten past the 
errors that I last reported. I have pushed the changes to git. Now I am getting 
the error below. The stack trace mentions osgi.wiring.package=thredds.catalog; 
I am not sure what that is.

ERROR: Bundle org.apache.tika.bundle [18] Error starting 
file:/home/afox/tika/tika/tika-bundle/target/test-bundles/tika-bundle.jar 
(org.osgi.framework.BundleException: Unresolved constraint in bundle 
org.apache.tika.bundle [18]: Unable to resolve 18.0: missing requirement [18.0] 
osgi.wiring.package; (osgi.wiring.package=thredds.catalog))
org.osgi.framework.BundleException: Unresolved constraint in bundle 
org.apache.tika.bundle [18]: Unable to resolve 18.0: missing requirement [18.0] 
osgi.wiring.package; (osgi.wiring.package=thredds.catalog)
    at org.apache.felix.framework.Felix.resolveBundleRevision(Felix.java:3980)
    at org.apache.felix.framework.Felix.startBundle(Felix.java:2043)
    at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1297)
    at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
    at java.lang.Thread.run(Thread.java:745)
544 [main] ERROR org.ops4j.pax.exam.nat.internal.NativeTestContainer - Bundle 
[org.apache.tika.bundle [18]] is not resolved
878 [main] INFO org.ops4j.pax.exam.spi.reactors.ReactorManager - suite finished
Tests run: 6, Failures: 1, Errors: 0, Skipped: 3, Time elapsed: 0.945 sec <<< FAILURE!




[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-12-08 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238572#comment-14238572
 ] 

Vineet Ghatge commented on TIKA-1423:
-

I was able to get past the above errors with the help of hints in TIKA-1276, but 
I am unable to get past the following error:

ERROR: Bundle org.apache.tika.bundle [18] Error starting 
file:/home/afox/tika/tika/tika-bundle/target/test-bundles/tika-bundle.jar 
(org.osgi.framework.BundleException: Unresolved constraint in bundle 
org.apache.tika.bundle [18]: Unable to resolve 18.0: missing requirement [18.0] 
osgi.wiring.package; (osgi.wiring.package=ucar.ma2))
org.osgi.framework.BundleException: Unresolved constraint in bundle 
org.apache.tika.bundle [18]: Unable to resolve 18.0: missing requirement [18.0] 
osgi.wiring.package; (osgi.wiring.package=ucar.ma2)
    at org.apache.felix.framework.Felix.resolveBundleRevision(Felix.java:3980)
    at org.apache.felix.framework.Felix.startBundle(Felix.java:2043)
    at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1297)
    at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
    at java.lang.Thread.run(Thread.java:745)
549 [main] ERROR org.ops4j.pax.exam.nat.internal.NativeTestContainer - Bundle 
[org.apache.tika.bundle [18]] is not resolved

I have pushed the changes so far to 
https://github.com/hemantku/tika/tree/TIKA-1423
What does the above call stack indicate? My guess is that we need to mention 
ucar.ma2 in the pom.xml, but that does not seem to help.

Any thoughts? [~chrismattmann] [~lewismc] [~annieburgess] [~tpalsulich]
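
For readers following the trace: a "missing requirement ... osgi.wiring.package=ucar.ma2" message means the bundle's manifest imports that package as mandatory and no installed bundle exports it. In OSGi, every Import-Package clause is mandatory unless it carries the directive resolution:=optional. The sketch below is one way to list the mandatory imports from a header string; it is a simplified illustration with hypothetical names, not code from Tika or Felix, and the sample header in the usage note is made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Tika or Felix: reads an Import-Package header. */
final class ImportPackageSketch {

    /** Splits the header on commas that are not inside double quotes. */
    static List<String> splitClauses(String header) {
        List<String> clauses = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : header.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;
            }
            if (c == ',' && !inQuotes) {
                clauses.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            clauses.add(current.toString().trim());
        }
        return clauses;
    }

    /** Returns package names from clauses that are not marked resolution:=optional. */
    static List<String> mandatoryPackages(String header) {
        List<String> mandatory = new ArrayList<>();
        for (String clause : splitClauses(header)) {
            List<String> names = new ArrayList<>();
            boolean optional = false;
            for (String part : clause.split(";")) {
                String p = part.trim();
                if (p.isEmpty()) {
                    continue;
                }
                if (p.contains("=")) {
                    // A parameter: either an attribute (version=...) or a directive (x:=y).
                    String normalized = p.replace(" ", "").replace("\"", "");
                    if (normalized.equals("resolution:=optional")) {
                        optional = true;
                    }
                } else {
                    names.add(p); // A package name.
                }
            }
            if (!optional) {
                mandatory.addAll(names);
            }
        }
        return mandatory;
    }
}
```

Given the made-up header `ucar.ma2,thredds.catalog;resolution:=optional`, `mandatoryPackages` returns only `ucar.ma2`, which is the kind of entry the framework must be able to wire before the bundle can start. The real header can be read from the bundle jar with `java.util.jar.JarFile.getManifest()`.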




[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-12-06 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236772#comment-14236772
 ] 

Vineet Ghatge commented on TIKA-1423:
-

[~tpalsulich]: It seems similar to the issue at 
https://issues.apache.org/jira/browse/TIKA-1276. I applied the patch 
TIKA-1276_20140428_3_rwesten.diff, but I am getting a new error:

Tests run: 6, Failures: 0, Errors: 6, Skipped: 0, Time elapsed: 0.169 sec <<< FAILURE!
initializationError(org.apache.tika.bundle.BundleIT)  Time elapsed: 0.005 sec  <<< ERROR!
java.lang.Exception: Method testBundleSimpleText should have no parameters
    at org.junit.runners.model.FrameworkMethod.validatePublicVoidNoArg(FrameworkMethod.java:72)
    at org.junit.runners.ParentRunner.validatePublicVoidNoArgMethods(ParentRunner.java:133)
    at org.junit.runners.BlockJUnit4ClassRunner.validateTestMethods(BlockJUnit4ClassRunner.java:186)
    at org.junit.runners.BlockJUnit4ClassRunner.validateInstanceMethods(BlockJUnit4ClassRunner.java:166)
    at org.junit.runners.BlockJUnit4ClassRunner.collectInitializationErrors(BlockJUnit4ClassRunner.java:104)
    at org.junit.runners.ParentRunner.validate(ParentRunner.java:355)
    at org.junit.runners.ParentRunner.<init>(ParentRunner.java:76)
    at org.junit.runners.BlockJUnit4ClassRunner.<init>(BlockJUnit4ClassRunner.java:57)
    at org.ops4j.pax.exam.junit.impl.ProbeRunner.<init>(ProbeRunner.java:74)
    at org.ops4j.pax.exam.junit.PaxExam.createDelegate(PaxExam.java:82)
    at org.ops4j.pax.exam.junit.PaxExam.<init>(PaxExam.java:73)
    at org.ops4j.pax.exam.junit.JUnit4TestRunner.<init>(JUnit4TestRunner.java:30)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.junit.internal.builders.AnnotatedBuilder.buildRunner(AnnotatedBuilder.java:29)
    at org.junit.internal.builders.AnnotatedBuilder.runnerForClass(AnnotatedBuilder.java:21)
    at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
    at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:26)
    at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
    at org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:26)
    at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:51)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
    at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
    at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
    at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
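
The top frames show the JUnit 4 runner rejecting testBundleSimpleText during validation, before any test body runs: test methods must be public, return void, and take no arguments. The sketch below imitates that rule with plain reflection so the failure can be reproduced in isolation. It is an illustration only, not JUnit's code; the class names and the Sample class are hypothetical, and whether BundleIT really passes an argument such as a bundle context to its test methods is an assumption the thread does not confirm.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the check a JUnit 4 runner performs; not JUnit code. */
final class NoArgTestCheck {

    /** Collects one message for every rule the method breaks. */
    static List<String> validate(Method m) {
        List<String> errors = new ArrayList<>();
        if (!Modifier.isPublic(m.getModifiers())) {
            errors.add("Method " + m.getName() + " should be public");
        }
        if (m.getReturnType() != Void.TYPE) {
            errors.add("Method " + m.getName() + " should be void");
        }
        if (m.getParameterTypes().length != 0) {
            errors.add("Method " + m.getName() + " should have no parameters");
        }
        return errors;
    }

    /** Validates every declared method of the given type that has the given name. */
    static List<String> validateByName(Class<?> type, String methodName) {
        List<String> errors = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                errors.addAll(validate(m));
            }
        }
        return errors;
    }

    /** Made-up example: one method that takes an argument, one that does not. */
    public static class Sample {
        public void testBundleSimpleText(Object injectedByContainer) { }
        public void testNoArg() { }
    }
}
```

Calling `NoArgTestCheck.validateByName(NoArgTestCheck.Sample.class, "testBundleSimpleText")` yields the same "should have no parameters" message as the trace, while `testNoArg` passes. If the real test needs a container-provided object, the usual JUnit 4 compatible shape is a field that the container fills in, with the test method itself left without parameters.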



[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-12-06 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236690#comment-14236690
 ] 

Vineet Ghatge commented on TIKA-1423:
-

[~chrismattmann]: Please find the patch attached.



[jira] [Updated] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-12-06 Thread Vineet Ghatge (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Ghatge updated TIKA-1423:

Attachment: TIKA-1423.patch



[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-12-05 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236206#comment-14236206
 ] 

Vineet Ghatge commented on TIKA-1423:
-

[~tpalsulich]: What is the error you get when the tests fail? Is it the same as 
the "not a valid CDM file" error?



[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-12-05 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236124#comment-14236124
 ] 

Vineet Ghatge commented on TIKA-1423:
-

[~tpalsulich]: Should it be this one: 
https://artifacts.unidata.ucar.edu/content/repositories/unidata-releases/edu/ucar/grib/4.5.3/
? I am not sure that we require this; the parser should need the netcdf jars only.



[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-12-05 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236106#comment-14236106
 ] 

Vineet Ghatge commented on TIKA-1423:
-

[~tpalsulich]: Thanks for helping me out! I think it will be better to use the 
following:
The Thredds parent pom is available at 
http://search.maven.org/#artifactdetails|edu.ucar|thredds-parent|4.5.3|pom
netCDF 4.5.3 is available at 
http://search.maven.org/#artifactdetails|edu.ucar|netcdf4|4.5.3|jar
I will push the changes from my repo so that they are in sync with the review 
board.



[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-30 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191333#comment-14191333
 ] 

Vineet Ghatge commented on TIKA-1423:
-

OK. I submitted the patch as Review Request #2741. Please let me know if you 
have any questions.



[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-30 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191249#comment-14191249
 ] 

Vineet Ghatge commented on TIKA-1423:
-

Hi [~tpalsulich], I tried that, but it fails even after deleting it from there. 
Can I push the jar file into the repo, or manually pass it on the classpath?



[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-30 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191228#comment-14191228
 ] 

Vineet Ghatge commented on TIKA-1423:
-

This was working well last week.



[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-30 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191225#comment-14191225
 ] 

Vineet Ghatge commented on TIKA-1423:
-

Hi [~chrismattmann] [~lewismc],

I am not sure if something changed in Maven or tika, but my tika build keeps 
failing with following error 

 Failed to read artifact descriptor for edu.ucar:netcdf:jar:4.3.22: Failure to 
find edu.ucar:thredds-parent:pom:4.3.22 in http://repo.maven.apache.org/maven2 
was cached in the local repository, resolution will not be reattempted until 
the update interval of central has elapsed or updates are forced -> [Help 1]

The local repository - http://search.maven.org/#search%7Cga%7C1%7Cnetcdf - has 
4.3.22, which has been on the system since August 2014. I have added the 
following tags in tika-parsers/pom.xml:

 
<dependency>
  <groupId>edu.ucar</groupId>
  <artifactId>netcdf</artifactId>
  <version>4.3.22</version>
</dependency>
 

Am I missing something else? Any pointers would be helpful.

> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
>  Issue Type: New Feature
>  Components: metadata, mime, parser
>Affects Versions: 1.6
>Reporter: Vineet Ghatge
>Assignee: Vineet Ghatge
>Priority: Critical
>  Labels: features, newbie
> Fix For: 1.8
>
> Attachments: GRIBParsertest.java, GribParser.java, 
> NLDAS_FORA0125_H.A20130112.1200.002.grb, fileName.html, 
> gdas1.forecmwf.2014062612.grib2


[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-25 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184312#comment-14184312
 ] 

Vineet Ghatge commented on TIKA-1423:
-

[~lewismc]: I tried mvn dependency:analyze-report and it did not show any 
conflicts:
// --
// Transitive dependencies of this project determined from the
// maven pom organized by organization.
// --

Apache Tika

I tried excluding the dependencies, but even that did not resolve it. I have 
finished the test case, which I have attached here.

I am trying to file a patch on reviews.apache.org, but I am not sure about the 
workflow. Is there a document that describes what I need to submit?

> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
>  Issue Type: New Feature
>  Components: metadata, mime, parser
>Affects Versions: 1.6
>Reporter: Vineet Ghatge
>Assignee: Vineet Ghatge
>Priority: Critical
>  Labels: features, newbie
> Fix For: 1.8
>
> Attachments: GRIBParsertest.java, GribParser.java, 
> NLDAS_FORA0125_H.A20130112.1200.002.grb, fileName.html, 
> gdas1.forecmwf.2014062612.grib2


[jira] [Updated] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-25 Thread Vineet Ghatge (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Ghatge updated TIKA-1423:

Attachment: GRIBParsertest.java

> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
>  Issue Type: New Feature
>  Components: metadata, mime, parser
>Affects Versions: 1.6
>Reporter: Vineet Ghatge
>Assignee: Vineet Ghatge
>Priority: Critical
>  Labels: features, newbie
> Fix For: 1.8
>
> Attachments: GRIBParsertest.java, GribParser.java, 
> NLDAS_FORA0125_H.A20130112.1200.002.grb, fileName.html, 
> gdas1.forecmwf.2014062612.grib2


[jira] [Updated] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-23 Thread Vineet Ghatge (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Ghatge updated TIKA-1423:

Attachment: fileName.html

Output in HTML

> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
>  Issue Type: New Feature
>  Components: metadata, mime, parser
>Affects Versions: 1.6
>Reporter: Vineet Ghatge
>Assignee: Vineet Ghatge
>Priority: Critical
>  Labels: features, newbie
> Fix For: 1.7
>
> Attachments: GribParser.java, 
> NLDAS_FORA0125_H.A20130112.1200.002.grb, fileName.html, 
> gdas1.forecmwf.2014062612.grib2


[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-23 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182194#comment-14182194
 ] 

Vineet Ghatge commented on TIKA-1423:
-

I used the parser to extract the data as HTML and it works. I have attached 
the output to this issue. One remaining problem is that the netcdfAll-4.5 jar 
keeps triggering these warnings:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/vineghlinux/Desktop/CoursesFall2014/CSCI572/DR/netcdfAll-4.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/vineghlinux/Desktop/CoursesFall2014/CSCI572/DR/tika-app-1.7-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/vineghlinux/Desktop/CoursesFall2014/CSCI572/DR/slf4j-simple-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]

I tried changing Tika's pom.xml, but that did not work either. I am now trying 
to remedy this based on http://www.slf4j.org/codes.html#multiple_bindings and 
http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/reference/JarDependencies.html
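
For anyone hitting the same warning, the duplicate bindings can also be listed 
directly from the classpath. This is only a minimal sketch, assuming SLF4J 1.x, 
which locates its binding through the resource 
org/slf4j/impl/StaticLoggerBinder.class (the same path that appears in the 
warnings above). The class name Slf4jBindingCheck and the helper findResources 
are hypothetical names made up for this illustration; only JDK classes are used.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class Slf4jBindingCheck {
    // Resource path that SLF4J 1.x looks up to find a logging binding
    // (taken from the warnings above).
    public static final String BINDER = "org/slf4j/impl/StaticLoggerBinder.class";

    /** Returns the location of every copy of the resource visible to the loader. */
    public static List<URL> findResources(ClassLoader loader, String resource)
            throws IOException {
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        List<URL> bindings = findResources(ClassLoader.getSystemClassLoader(), BINDER);
        System.out.println(bindings.size() + " SLF4J binding(s) on the classpath");
        for (URL url : bindings) {
            // Each URL names the jar that bundles a binding.
            System.out.println("  " + url);
        }
    }
}
```

Run with the same classpath as the failing command, this should print one URL 
per jar that bundles a binding; every jar beyond the first is a candidate for 
an exclusion.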

> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
>  Issue Type: New Feature
>  Components: metadata, mime, parser
>Affects Versions: 1.6
>Reporter: Vineet Ghatge
>Assignee: Vineet Ghatge
>Priority: Critical
>  Labels: features, newbie
> Fix For: 1.7
>
> Attachments: GribParser.java, 
> NLDAS_FORA0125_H.A20130112.1200.002.grb, gdas1.forecmwf.2014062612.grib2


[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-22 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180721#comment-14180721
 ] 

Vineet Ghatge commented on TIKA-1423:
-

Sure, I will include the test case. Thanks!

> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
>  Issue Type: New Feature
>  Components: metadata, mime, parser
>Affects Versions: 1.6
>Reporter: Vineet Ghatge
>Assignee: Vineet Ghatge
>Priority: Critical
>  Labels: features, newbie
> Fix For: 1.7
>
> Attachments: GribParser.java, 
> NLDAS_FORA0125_H.A20130112.1200.002.grb, gdas1.forecmwf.2014062612.grib2


[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-21 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178052#comment-14178052
 ] 

Vineet Ghatge commented on TIKA-1423:
-

Hey [~lewismc], I am working on it and will post my updates this week. You can 
assign this to me. 

> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
>  Issue Type: New Feature
>  Components: metadata, mime, parser
>Affects Versions: 1.6
>Reporter: Vineet Ghatge
>Assignee: Lewis John McGibbney
>Priority: Critical
>  Labels: features, newbie
> Fix For: 1.7
>
> Attachments: GribParser.java, 
> NLDAS_FORA0125_H.A20130112.1200.002.grb, gdas1.forecmwf.2014062612.grib2


[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-02 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156262#comment-14156262
 ] 

Vineet Ghatge commented on TIKA-1423:
-

UPDATE:
I picked this up from a conversation between Annie and Christian Ward from 
NetCDF - 
http://www.unidata.ucar.edu/mailing_lists/archives/netcdf-java/2014/msg00091.html
 - where a sample was provided. I ran that sample and it prints out the GRIB2 
data:

import java.io.File;
import java.io.IOException;

import ucar.nc2.NetcdfFile;
import ucar.nc2.dataset.NetcdfDataset;

public class Foo {
    public static void main(String[] args) throws IOException {
        File gribFile = new File("gdas1.forecmwf.2014062612.grib2");
        // Open the GRIB2 file through netCDF-Java; no CancelTask is supplied.
        NetcdfFile ncFile = NetcdfDataset.openFile(gribFile.getAbsolutePath(), null);
        System.out.println("Success!");
        try {
            // Print netCDF-Java's textual description of the opened file.
            System.out.println(ncFile.toString());
        } finally {
            ncFile.close();
        }
    }
}

This parses and loads the GRIB2 file. I am currently working on getting 
Annie's code running and changing the classpath references.

> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
>  Issue Type: New Feature
>  Components: metadata, mime, parser
>Affects Versions: 1.6
>Reporter: Vineet Ghatge
>Priority: Critical
>  Labels: features, newbie
> Fix For: 1.7
>
> Attachments: GribParser.java, 
> NLDAS_FORA0125_H.A20130112.1200.002.grb, gdas1.forecmwf.2014062612.grib2


[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-02 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156146#comment-14156146
 ] 

Vineet Ghatge commented on TIKA-1423:
-

Thanks [~lewismc]

> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
>  Issue Type: New Feature
>  Components: metadata, mime, parser
>Affects Versions: 1.6
>Reporter: Vineet Ghatge
>Priority: Critical
>  Labels: features, newbie
> Fix For: 1.7
>
> Attachments: GribParser.java, 
> NLDAS_FORA0125_H.A20130112.1200.002.grb, gdas1.forecmwf.2014062612.grib2


[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-09-30 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154360#comment-14154360
 ] 

Vineet Ghatge commented on TIKA-1423:
-

[~annieburgess] Do you have another sample GRIB2 file? The current file fails 
with the following error:

Caused by: java.io.IOException: Cant read /home/vineghlinux/workspace/DR/gdas1.forecmwf.2014062612.grib2: not a valid CDM file.
    at ucar.nc2.NetcdfFile.open(NetcdfFile.java:734)
    at ucar.nc2.NetcdfFile.open(NetcdfFile.java:384)
    at ucar.nc2.dataset.NetcdfDataset.openOrAcquireFile(NetcdfDataset.java:687)
    at ucar.nc2.dataset.NetcdfDataset.openFile(NetcdfDataset.java:564)
    at grib.grib.parse(grib.java:69)
    ... 1 more
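
A quick way to rule out a damaged download before blaming the library is to 
look at the first bytes of the file. The sketch below is not part of 
netCDF-Java or Tika. It assumes the file starts directly with the Indicator 
Section, whose first four octets are the ASCII characters "GRIB" and whose 
eighth octet holds the edition number. The class name GribHeaderCheck and the 
method gribEdition are hypothetical names used only for this illustration.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class GribHeaderCheck {
    /**
     * Returns the GRIB edition number stored in octet 8 of the Indicator
     * Section, or -1 if the stream does not begin with the ASCII characters
     * "GRIB" (assumption: the record starts at byte 0 of the stream).
     */
    public static int gribEdition(InputStream in) throws IOException {
        byte[] header = new byte[8];
        try {
            new DataInputStream(in).readFully(header);
        } catch (EOFException e) {
            return -1; // fewer than 8 bytes available
        }
        String magic = new String(header, 0, 4, StandardCharsets.US_ASCII);
        return "GRIB".equals(magic) ? (header[7] & 0xFF) : -1;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java GribHeaderCheck <path-to-file>
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println("GRIB edition: " + gribEdition(in));
        }
    }
}
```

A result of -1 would suggest the file is truncated, wrapped in another 
container, or not GRIB at all; a result of 1 or 2 would point the 
investigation back at the library version and classpath instead.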



> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
>  Issue Type: New Feature
>  Components: metadata, mime, parser
>Affects Versions: 1.6
>Reporter: Vineet Ghatge
>Priority: Critical
>  Labels: features, newbie
> Fix For: 1.7
>
> Attachments: GribParser.java, gdas1.forecmwf.2014062612.grib2


[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-09-25 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148398#comment-14148398
 ] 

Vineet Ghatge commented on TIKA-1423:
-

I am pulling the data and JAR file and trying to set up an environment to 
reproduce the failing scenario.

> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
>  Issue Type: New Feature
>  Components: metadata, mime, parser
>Affects Versions: 1.6
>Reporter: Vineet Ghatge
>Priority: Critical
>  Labels: features, newbie
> Fix For: 1.7
>
> Attachments: GribParser.java, gdas1.forecmwf.2014062612.grib2


[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-09-22 Thread Vineet Ghatge (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144013#comment-14144013
 ] 

Vineet Ghatge commented on TIKA-1423:
-

[~lewismc] I still have to take a look at the parser provided by 
[~annieburgess]. Thanks for sharing what you have developed. I plan to build on 
the wrapper 
(http://www.unidata.ucar.edu/blogs/news/entry/netcdf_java_library_and_tds7). 
There is a recent 4.3 release which is supposedly capable of handling both 
GRIB 1 and GRIB 2. I need to explore this, and I am guessing the sample 
parser works with it. I will post my updates as I progress. Thanks

> Build a parser to extract data from GRIB formats
> 
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
>  Issue Type: New Feature
>  Components: metadata, mime, parser
>Affects Versions: 1.6
>Reporter: Vineet Ghatge
>Priority: Critical
>  Labels: features, newbie
> Fix For: 1.7
>
> Attachments: GribParser.java, gdas1.forecmwf.2014062612.grib2


[jira] [Created] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-09-21 Thread Vineet Ghatge (JIRA)
Vineet Ghatge created TIKA-1423:
---

 Summary: Build a parser to extract data from GRIB formats
 Key: TIKA-1423
 URL: https://issues.apache.org/jira/browse/TIKA-1423
 Project: Tika
  Issue Type: New Feature
  Components: metadata, mime, parser
Affects Versions: 1.6
Reporter: Vineet Ghatge
Priority: Critical
 Fix For: 1.7


The Arctic dataset contains files in a MIME format called GRIB - General 
Regularly-distributed Information in Binary form 
(http://en.wikipedia.org/wiki/GRIB). GRIB is a well-known, concise data format 
used in meteorology to store historical and forecast weather data. There are 2 
different types of the format - GRIB 0, GRIB 2. The focus will be on GRIB 2, 
which is the most prevalent. Each GRIB record intended for either transmission 
or storage contains a single parameter with values located at an array of grid 
points, or represented as a set of spectral coefficients, for a single level 
(or layer), encoded as a continuous bit stream. Logical divisions of the record 
are designated as "sections", each of which provides control information and/or 
data. A GRIB record consists of six sections, two of which are optional:

(0) Indicator Section
(1) Product Definition Section (PDS)
(2) Grid Description Section (GDS) - optional
(3) Bit Map Section (BMS) - optional
(4) Binary Data Section (BDS)
(5) '' (ASCII Characters)


