Re: [gdal-dev] Neatline for USGS PDF maps

2013-04-17 Thread Eli Adam
I found this in drafts and it appears I failed to send it.  Sorry for
the delay.  Sent partly for the list archives at this point.

On Sat, Jan 19, 2013 at 8:14 AM, Even Rouault
even.roua...@mines-paris.org wrote:
 Looking more closely at those files, I see that there are various
 registration blocks. The algorithm up to now was to select the
 registration block whose neatline covered the most area in terms of
 pixels. In the case of
 OR_Newport_North_20110824_TM_geo.pdf, those blocks are :
 UTM Grid and Projection
 Orthoimage
 Map Layers
 Adjoining Quadrangles Diagram

 The number and names of blocks may change, but in all the USGS topo PDF
 samples I've tried, the Map Layers block is always present and seems to be
 the one that leads to the best results, so I've just pushed a change to
 select it when it is found.

--config GDAL_PDF_NEATLINE is very helpful.  Did you find the
registration name blocks with one of the supporting PDF libraries?  Is
it possible to find these multiple registration name blocks from gdal?
 (I tried: gdalinfo --debug on without success.)

Thanks for the many recent improvements for the USGS topo PDFs.
--config GDAL_PDF_RENDERING_OPTIONS is very useful.


 You can use the following Python script to automate fetching the neatline
 and launching gdalwarp to use it :

 

 from osgeo import gdal
 import os
 import sys

 ds = gdal.Open(sys.argv[1])
 neatline_wkt = ds.GetMetadataItem("NEATLINE")
 ds = None

 f = open('cutline.csv', 'wt')
 f.write('id,WKT\n')
 f.write('1,"%s"\n' % neatline_wkt)
 f.close()

 os.system('gdalwarp %s %s.tif ' % (sys.argv[1], sys.argv[1]) +
   '-crop_to_cutline -cutline cutline.csv -overwrite')

 

This is great.  I've added it to the wiki,
http://trac.osgeo.org/gdal/wiki/USGS_PDF_Topo


 If you're interested in only the raster part, and assuming the above
 script is saved as cutline.py, you can try the following:

 export GDAL_PDF_RENDERING_OPTIONS=RASTER
 (or set GDAL_PDF_RENDERING_OPTIONS=RASTER on windows)

 python cutline.py your.pdf

 nearblack your.pdf -o your_rgba.pdf -of GTiff -setalpha -color 0,0,0 \
 -color 255,255,255


I interpreted the above as:
nearblack your.tif -o your_rgba.tif -of GTiff -setalpha -color 0,0,0
-color 255,255,255
Where your.tif is the output from cutline.py and your_rgba.tif is the
output from nearblack.

Thanks, Eli



 Best regards,

 Even
 ___
 gdal-dev mailing list
 gdal-dev@lists.osgeo.org
 http://lists.osgeo.org/mailman/listinfo/gdal-dev


Re: [gdal-dev] Neatline for USGS PDF maps

2013-04-17 Thread Even Rouault
On Wednesday 17 April 2013 17:34:13, Eli Adam wrote:
 I found this in drafts and it appears I failed to send it.  Sorry for
 the delay.  Sent partly for the list archives at this point.
 
 On Sat, Jan 19, 2013 at 8:14 AM, Even Rouault
 
 even.roua...@mines-paris.org wrote:
  Looking more closely at those files, I see that there are various
  registration blocks. The algorithm up to now was to select the
  registration block whose neatline covered the most area in terms of
  pixels. In the case of
  OR_Newport_North_20110824_TM_geo.pdf, those blocks are :
  UTM Grid and Projection
  Orthoimage
  Map Layers
  Adjoining Quadrangles Diagram
  
  The number and names of blocks may change, but in all the USGS topo PDF
  samples I've tried, the Map Layers block is always present and seems to be
  the one that leads to the best results, so I've just pushed a change to
  select it when it is found.
 
 --config GDAL_PDF_NEATLINE is very helpful.  Did you find the
 registration name blocks with one of the supporting PDF libraries?  Is
 it possible to find these multiple registration name blocks from gdal?
  (I tried: gdalinfo --debug on without success.)
 

I can see them in the "PDF: Description = " lines

$ gdalinfo --debug on ~/gdal/data/geopdf/OR_Newport_North_20110824_TM_geo.pdf
PDF: DPI guessed from contents stream = 600.0509320574655
PDF: OGC Encoding Best Practice style detected
PDF: LGIDict Version : 2.3
PDF: Description = UTM Grid and Projection
PDF: This is the largest neatline for now
PDF: LGIDict Version : 2.3
PDF: Description = Orthoimage
PDF: Not the largest neatline. Skipping it
PDF: LGIDict Version : 2.3
PDF: Description = Map Layers
PDF: The Map Layers registration will be selected
PDF: LGIDict Version : 2.3
PDF: Description = Adjoining Quadrangles Diagram
PDF: Not the largest neatline. Skipping it
PDF: Description = Map Layers
[...]
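
For the archives, a minimal sketch of pulling the registration block names out of that debug output with Python. It assumes the debug text has already been captured into a string (I believe the debug messages go to stderr, so that is where to capture them from); the function name registration_names is made up for this example and is not part of GDAL.

```python
import re

# Matches the debug lines that name a registration block.
DESCRIPTION_RE = re.compile(r"^PDF: Description = (.*)$")


def registration_names(debug_text):
    """Return the distinct block names, in order of first appearance."""
    names = []
    for line in debug_text.splitlines():
        match = DESCRIPTION_RE.match(line.strip())
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names


# Debug lines copied from the gdalinfo run above.
sample = """\
PDF: LGIDict Version : 2.3
PDF: Description = UTM Grid and Projection
PDF: This is the largest neatline for now
PDF: LGIDict Version : 2.3
PDF: Description = Orthoimage
PDF: Not the largest neatline. Skipping it
PDF: LGIDict Version : 2.3
PDF: Description = Map Layers
PDF: The Map Layers registration will be selected
PDF: LGIDict Version : 2.3
PDF: Description = Adjoining Quadrangles Diagram
PDF: Not the largest neatline. Skipping it
PDF: Description = Map Layers
"""
print(registration_names(sample))
```

Any of the names found this way should be usable as the value of GDAL_PDF_NEATLINE.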


Re: [gdal-dev] Neatline for USGS PDF maps

2013-01-19 Thread Even Rouault
On Saturday 19 January 2013 03:38:16, Eli Adam wrote:
 Checking over some USGS topo PDFs, the neatline reported appears too
 large.  Has anyone else checked this or noticed anything similar?
 Specific details below.

Eli,

Looking more closely at those files, I see that there are various registration 
blocks. The algorithm up to now was to select the registration block whose 
neatline covered the most area in terms of pixels. In the case of 
OR_Newport_North_20110824_TM_geo.pdf, those blocks are :
UTM Grid and Projection
Orthoimage
Map Layers
Adjoining Quadrangles Diagram

The number and names of blocks may change, but in all the USGS topo PDF samples 
I've tried, the Map Layers block is always present and seems to be the one that 
leads to the best results, so I've just pushed a change to select it when it is 
found.
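
To illustrate the selection rule described above, here is a rough Python sketch. It is not the driver's actual code: the Registration record, the select_registration function and the pixel areas are all hypothetical, made up for this example. Only the block names come from the file mentioned above.

```python
from collections import namedtuple

# Hypothetical record: block description plus neatline area in pixels.
Registration = namedtuple("Registration", ["description", "pixel_area"])


def select_registration(blocks, preferred="Map Layers"):
    """Return the preferred block if present, else the largest-area one."""
    if not blocks:
        return None
    for block in blocks:
        if block.description == preferred:
            return block
    # Fallback: the earlier behaviour, largest neatline in pixels.
    return max(blocks, key=lambda b: b.pixel_area)


# Block names from OR_Newport_North_20110824_TM_geo.pdf; areas are invented.
blocks = [
    Registration("UTM Grid and Projection", 120),
    Registration("Orthoimage", 90),
    Registration("Map Layers", 100),
    Registration("Adjoining Quadrangles Diagram", 5),
]
chosen = select_registration(blocks)
print(chosen.description)
```

With the old rule alone, the block with the largest area would win even when it extends past the map frame, which matches the too-large neatline reported in this thread.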

You can use the following Python script to automate fetching the neatline and 
launching gdalwarp to use it :



from osgeo import gdal
import os
import sys

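# Read the neatline (a WKT polygon) that the PDF driver reports as metadata
ds = gdal.Open(sys.argv[1])
neatline_wkt = ds.GetMetadataItem("NEATLINE")
ds = None

# Write it to a CSV file that OGR can read as the cutline layer;
# the WKT contains commas, so it has to be double-quoted
f = open('cutline.csv', 'wt')
f.write('id,WKT\n')
f.write('1,"%s"\n' % neatline_wkt)
f.close()

# Warp to <input>.tif, cropped to the neatline
os.system('gdalwarp %s %s.tif ' % (sys.argv[1], sys.argv[1]) +
  '-crop_to_cutline -cutline cutline.csv -overwrite')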
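
A possible variant of the script above, as a sketch only: it lets the csv module take care of quoting the WKT and builds the gdalwarp call as an argument list, so file names with spaces don't need shell quoting. The helper names write_cutline_csv and gdalwarp_command are made up for this example, and the WKT is a placeholder standing in for the real NEATLINE value. It does not call GDAL itself.

```python
import csv
import os
import tempfile


def write_cutline_csv(neatline_wkt, path):
    """Write a one-row CSV with an id and the neatline WKT."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "WKT"])
        # csv.writer quotes the WKT automatically because it contains commas.
        writer.writerow([1, neatline_wkt])


def gdalwarp_command(src, dst, cutline_path):
    """Build the gdalwarp call as an argument list (no shell involved)."""
    return ["gdalwarp", src, dst,
            "-crop_to_cutline", "-cutline", cutline_path, "-overwrite"]


# Placeholder WKT standing in for ds.GetMetadataItem("NEATLINE").
example_wkt = "POLYGON ((0 0,0 10,10 10,10 0,0 0))"
cutline_path = os.path.join(tempfile.mkdtemp(), "cutline.csv")
write_cutline_csv(example_wkt, cutline_path)
cmd = gdalwarp_command("your.pdf", "your.pdf.tif", cutline_path)
print(" ".join(cmd))
# To actually run it (requires GDAL): subprocess.check_call(cmd)
```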



Best regards,

Even


Re: [gdal-dev] Neatline for USGS PDF maps

2013-01-19 Thread Even Rouault
On Saturday 19 January 2013 16:28:53, Even Rouault wrote:
 On Saturday 19 January 2013 03:38:16, Eli Adam wrote:
  Checking over some USGS topo PDFs, the neatline reported appears too
  large.  Has anyone else checked this or noticed anything similar?
  Specific details below.
 
 Eli,
 
 Looking more closely at those files, I see that there are various
 registration blocks. The algorithm up to now was to select the
 registration block whose neatline covered the most area in terms of
 pixels. In the case of
 OR_Newport_North_20110824_TM_geo.pdf, those blocks are :
 UTM Grid and Projection
 Orthoimage
 Map Layers
 Adjoining Quadrangles Diagram
 
 The number and names of blocks may change, but in all the USGS topo PDF
 samples I've tried, the Map Layers block is always present and seems to be
 the one that leads to the best results, so I've just pushed a change to
 select it when it is found.
 
 You can use the following Python script to automate fetching the neatline
 and launching gdalwarp to use it :
 
 
 
 from osgeo import gdal
 import os
 import sys
 
 ds = gdal.Open(sys.argv[1])
 neatline_wkt = ds.GetMetadataItem("NEATLINE")
 ds = None
 
 f = open('cutline.csv', 'wt')
 f.write('id,WKT\n')
 f.write('1,"%s"\n' % neatline_wkt)
 f.close()
 
 os.system('gdalwarp %s %s.tif ' % (sys.argv[1], sys.argv[1]) +
   '-crop_to_cutline -cutline cutline.csv -overwrite')
 
 

If you're interested in only the raster part, and assuming the above script 
is saved as cutline.py, you can try the following:

export GDAL_PDF_RENDERING_OPTIONS=RASTER
(or set GDAL_PDF_RENDERING_OPTIONS=RASTER on windows)

python cutline.py your.pdf

nearblack your.pdf -o your_rgba.pdf -of GTiff -setalpha -color 0,0,0 \
-color 255,255,255
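
The same steps can also be driven from Python. This is only a sketch: the helper names raster_only_env and nearblack_command are made up for this example, and it assumes nearblack is fed the GeoTIFF written by cutline.py, which (as the script is written) is named after the input with .tif appended, i.e. your.pdf.tif.

```python
import os


def raster_only_env(base_env=None):
    """Copy the environment and ask the PDF driver for raster content only."""
    env = dict(os.environ if base_env is None else base_env)
    env["GDAL_PDF_RENDERING_OPTIONS"] = "RASTER"
    return env


def nearblack_command(src_tif, dst_tif):
    """Build the nearblack call that turns black and white collars transparent."""
    return ["nearblack", src_tif, "-o", dst_tif, "-of", "GTiff",
            "-setalpha", "-color", "0,0,0", "-color", "255,255,255"]


env = raster_only_env()
cmd = nearblack_command("your.pdf.tif", "your_rgba.tif")
print(" ".join(cmd))
# To actually run the steps (requires GDAL and cutline.py):
#   subprocess.check_call(["python", "cutline.py", "your.pdf"], env=env)
#   subprocess.check_call(cmd, env=env)
```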

 
 Best regards,
 
 Even


[gdal-dev] Neatline for USGS PDF maps

2013-01-18 Thread Eli Adam
Checking over some USGS topo PDFs, the neatline reported appears too
large.  Has anyone else checked this or noticed anything similar?
Specific details below.


 I did the same thing and got the same result.  If you make a shapefile
 out of the neatline and view it, you will see that it matches to the
 black.  So it is a correct result but not intended.  So we need
 different values for the neatline.  Here are values that I just
 estimated off of QGIS:

 Record_Id,wkb_Polygon
 1,"POLYGON ((420793 4955647,420689 4941858,410784 4942004,410971 4955792))"

 Using this gives expected results.

 Does this pdf file have incorrect neatline information?  I'll look at
 some others to see if they work better.


It looks to me that the USGS topo pdf from
http://ims.er.usgs.gov/gda_services/download?item_id=5365522 reports a
neatline that covers most of the pdf, NEATLINE=POLYGON
((421614.539994676539209
4956417.675689895637333,421413.787766160559841
4941008.479600958526134,409984.382813899661414
4941157.382794972509146,410185.135042413836345
4956566.578883905895054,421614.539994676539209
4956417.675689895637333)) but should cover much less area as estimated
out of QGIS above.
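
To put a rough number on the difference, the two polygons can be compared with the shoelace formula. This is just a sketch in plain Python: the coordinates are copied from above and are in the file's projected units, and the function name ring_area is made up for this example.

```python
def ring_area(points):
    """Shoelace formula; the ring may be open or closed."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# NEATLINE as reported by gdalinfo for the Oregon file.
reported = [
    (421614.539994676539209, 4956417.675689895637333),
    (421413.787766160559841, 4941008.479600958526134),
    (409984.382813899661414, 4941157.382794972509146),
    (410185.135042413836345, 4956566.578883905895054),
    (421614.539994676539209, 4956417.675689895637333),
]
# Neatline estimated by eye in QGIS.
estimated = [
    (420793, 4955647),
    (420689, 4941858),
    (410784, 4942004),
    (410971, 4955792),
]
ratio = ring_area(reported) / ring_area(estimated)
print("reported / estimated area ratio: %.2f" % ratio)
```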

Is this an error within the file or an error in what gdalinfo reports
or something else?  If it is an error in the file, I can contact the
USGS liaison for the Pacific Northwest to see if it can be fixed (at
least for Oregon).

I checked other USGS Topos in OR, CO, MI and found the same problem.
I tried some in ND and IA and the neatline seemed correct.
Specifically,
http://ims.er.usgs.gov/gda_services/download?item_id=5154397&quad=Granger&state=IA&grid=7.5X7.5&series=TNM%20GeoPDF
and
http://ims.er.usgs.gov/gda_services/download?item_id=5251428&quad=Nelson%20Lake&state=ND&grid=7.5X7.5&series=TNM%20GeoPDF

It is great to have the PDF driver to make more data accessible.
GDAL/OGR always makes me smile when I encounter data in some new-to-me
format and it is already supported (in the last few months, SEG-Y).

Best Regards, Eli