hi all,

Here is my personal view on PDF.

It may have become an open format, but that openness matters mostly to 
people developing software for and around PDF.
For a user it IS a good format for viewing and printing anything.
However, if you want to reuse or further process the contents in any way, 
it is a BAD format, whether it is a scanned PDF or a non-scanned PDF.
You cannot easily get text out of a PDF as expected, and tables do not come 
out as tables.
Hence PDF is good as an endpoint but not as an intermediate format.
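
To illustrate the point about tables, here is a minimal Python sketch. The 
sample data, the function names, and the "split on two or more spaces" 
heuristic are all assumptions made up for illustration, not the behaviour of 
any particular PDF tool. It contrasts a table stored as CSV with the same 
table as the flat text a PDF text extractor might plausibly emit.

```python
import csv
import io
import re

# Hypothetical sample: the same small table, once as CSV and once as flat
# text where columns are separated only by spacing (as extracted PDF text
# often is). The second data row has an empty "candidate" cell.
CSV_TEXT = (
    "constituency,candidate,votes\n"
    "North Town,A. Kumar,1200\n"
    "South Town,,950\n"
)
PDF_TEXT = (
    "constituency   candidate   votes\n"
    "North Town     A. Kumar    1200\n"
    "South Town                 950\n"
)


def rows_from_csv(text):
    """Parse CSV text; the format itself marks where each cell ends."""
    return list(csv.reader(io.StringIO(text)))


def rows_from_pdf_text(text):
    """Guess columns by treating runs of 2+ spaces as separators.

    This is only a heuristic: an empty cell leaves no trace, so the
    remaining values silently shift into the wrong columns.
    """
    return [re.split(r"\s{2,}", line.strip()) for line in text.splitlines()]


if __name__ == "__main__":
    print(rows_from_csv(CSV_TEXT)[2])       # three cells, one of them empty
    print(rows_from_pdf_text(PDF_TEXT)[2])  # only two cells survive
```

With CSV the empty cell is preserved, so every row keeps three columns. With 
the flat text the empty cell disappears and the vote count lands in the 
"candidate" column, which is the kind of manual cleanup people end up doing.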



On Tuesday, August 12, 2014 1:41:59 PM UTC+5:30, Chandrashekhar Raman wrote:
>
> Want to echo the points made by a few folks on the 'tyranny of pdfs'
>
> I have spent countless hours with free trial versions of PDF-to-Excel 
> conversion software, wrangling election data from PDFs available online 
> into a usable Excel (or CSV) format, and in the process infected my 
> computer with plenty of malware. Extracting and cleaning data is a learning 
> exercise in itself, but making it available in a machine-readable format 
> would take this to a whole new level. 
>
> Tim Berners-Lee's framework on the 'degree of openness' of open data is 
> perhaps relevant to this discussion.
>
>   1 Star: Data available on the Web, with an open license, but not machine 
>      readable, e.g. Zip files and PDFs
>   2 Star: Data available in machine-readable proprietary formats, e.g. Excel
>   3 Star: Data available in machine-readable, non-proprietary formats, e.g. 
>      XML, RDF, JSON or even CSV
>   4 Star: Data available using an open linked data format (URIs, so people 
>      can point to your data)
>   5 Star: Data available and linked to other data (to provide context and 
>      relationships)
>
> I am not a tech expert, but it seems that there is a lot of activity around 
> moving data from the '1 star' to the '5 star' level, and that could be 
> one guiding thought for us as a group.
>
> cs
>
>
>
> On Tue, Aug 12, 2014 at 1:16 PM, Ma-roof M <mahr...@gmail.com> wrote:
>
>> I agree fully. I am as enraged by the pdfs as many others are :-)
>> What can be more enraging than staring at a scanned PDF, with tables of 
>> numbers, which one is absolutely sure came out of a spreadsheet?
>>
>> I was hinting that if we ask for 'open standards', they could fall back on 
>> the fact that PDF is itself an open standard. 
>> So at the least, we should ensure that whatever PDFs get shared are not 
>> image PDFs.
>>
>> Kind Regards
>> Mahroof
>> ____________________________________________________________
>> Knowledge, that is *discovered*, lasts a lifetime..
>>  
>>
>>
>> On 12 August 2014 12:40, Vaishnavi Jayakumar (Inclusive India) <
>> vaishnavi...@inclusiveindia.info> wrote:
>>
>>> PDF became an open standard, yes. But the software required for easy 
>>> creation of PDF/UA 
>>> <http://www.pdfa.org/wp-content/uploads/2013/08/PDFUA-in-a-Nutshell-PDFUA.pdf> 
>>> documents (PDF accessibility for screen readers) is invariably paid, i.e. 
>>> Adobe Acrobat Pro. Otherwise, training on the following will have to take 
>>> place, which will be a mammoth exercise: 
>>> http://www.w3.org/TR/WCAG20-TECHS/pdf.html
>>> The spectrum of PDF types (hybrid, /A, /UA etc.) is also likely to cause 
>>> confusion. The following article on document accessibility is a good 
>>> place to start thinking about Universal Design principles: 
>>> http://www.accessiq.org/news/news/2014/03/why-document-accessibility-and-pdfua-matters
>>> *(Accessibility, btw, is used in disability lingo to refer to things being 
>>> accessible to disabled people as well as non-disabled people on an equal 
>>> basis, not just run-of-the-mill accessibility.)*
>>>
>>> Ethically I am against government using paid proprietary software where 
>>> FOSS alternatives exist. 
>>> Which is why I was pleased to see the UK's official adoption of the Open 
>>> Document Format recently: 
>>> http://www.theregister.co.uk/2014/07/23/uk_government_officially_adopts_open_document_format/
>>>
>>> Can PDFs be made accessible? Yes. 
>>> But I'd go with the sentiment expressed below via 
>>> http://datadenkers.wordpress.com/2014/06/06/the-united-nations-and-its-pdf-ghettos-supposedly-open-data-and-the-lack-of-transparency/
>>> :
>>>
>>>> What is clear though is that frankly, there is no reason whatsoever, why 
>>>> ‘open data’ would ever be ‘released’ in PDF format, instead of .csv or 
>>>> .xcl. As Nathanial Manning 
>>>> <http://www.theguardian.com/global-development-professionals-network/2013/oct/21/development-open-data-action> 
>>>> puts it: “This is like funding James Cameron to make Avatar, and then 
>>>> releasing it in a black and white flipbook. We are missing all the good 
>>>> stuff.”
>>>
>>>
>>> ---------------------------------------
>>> *VAISHNAVI JAYAKUMAR*
>>> http://about.me/vjayakumar
>>>
>>>
>>> On Tue, Aug 12, 2014 at 10:33 AM, Ma-roof M <mahr...@gmail.com> wrote:
>>>
>>>> For that matter, PDF is an open standard. So are the Microsoft Office 
>>>> formats since Office 2007 (Office Open XML).
>>>>
>>>> We might want to specifically address the issue of the "image-pdf" 
>>>> menace.
>>>>
>>>> Kind Regards
>>>> Mahroof
>>>>
>>>> ____________________________________________________________
>>>> Knowledge, that is *discovered*, lasts a lifetime..
>>>>  
>>>>
>>>>
>>>> On 12 August 2014 10:18, Vaishnavi Jayakumar (Inclusive India) <
>>>> vaishnavi...@inclusiveindia.info> wrote:
>>>>
>>>>> Are there no guidelines in existence for what formats need to be used 
>>>>> for data / information sharing? If data is open, isn't it necessarily in 
>>>>> open formats?
>>>>>
>>>>> Echo that the letter needs to be sent. But Datameet should next 
>>>>> prioritise advocating for standard open formats for information sharing 
>>>>> at the policy level.
>>>>>
>>>>> In my work on disability advocacy, most of the time the bills put up 
>>>>> for public comment are PDFs created from crooked scans of printouts. 
>>>>> They need to be transcribed before people with low vision / blindness 
>>>>> can read them. Such a waste. 
>>>>>
>>>>> Nirmita at CIS should be able to advise on universal open formats that 
>>>>> would be equally accessible by disabled people too.
>>>>>
>>>>> ---------------------------------------
>>>>> *VAISHNAVI JAYAKUMAR*
>>>>> http://about.me/vjayakumar
>>>>>
>>>>>
>>>>> On Tue, Aug 12, 2014 at 8:14 AM, Nisha Thompson <ni...@datameet.org> 
>>>>> wrote:
>>>>>
>>>>>> That's a really good question. 
>>>>>>
>>>>>> I can see them making the point that they comply because a great deal 
>>>>>> of their data is online in some fashion, just not machine readable or 
>>>>>> easy to find. Also, they don't have a data controller registered with 
>>>>>> the NIC: 
>>>>>> http://data.gov.in/datacontrollers
>>>>>>
>>>>>> It is unclear to me how strong an NDSAP request can be, as it doesn't 
>>>>>> hold the weight an RTI request does. My vote is that we shouldn't 
>>>>>> reference it in the letter, and instead make a separate request to the 
>>>>>> data.gov.in NIC team to meet with them.
>>>>>>  
>>>>>> Nisha
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Aug 11, 2014 at 8:02 PM, Rajesh D Hanbal <raje...@gmail.com> 
>>>>>> wrote:
>>>>>>
>>>>>>> Nisha, 
>>>>>>>
>>>>>>> How does our request fit with the NDSAP? Does it deal only with Dept 
>>>>>>> of S & T or does it apply to all departments? 
>>>>>>>
>>>>>>> http://www.dst.gov.in/nsdi.html : 
>>>>>>>
>>>>>>> "Global experience has demonstrated convincingly that access to 
>>>>>>> data leads to breakthroughs in scientific understanding as well as to 
>>>>>>> economic and public good, in addition to several benefits to civil 
>>>>>>> society.   *Given the deployment of substantial level of investment 
>>>>>>> of public funds in collection of data and the untapped potentials of 
>>>>>>> benefits to social society, it has become important to make available 
>>>>>>> non-sensitive data for legitimate and registered use."*
>>>>>>>
>>>>>>> If it is relevant, we could make a reference to it. Otherwise, the 
>>>>>>> letter looks fine to me. 
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Rajesh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Aug 11, 2014 at 5:48 PM, Nisha Thompson <ni...@datameet.org> 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey All,
>>>>>>>>
>>>>>>>> Even though it has been a while, I think it is still important that 
>>>>>>>> we send this letter.
>>>>>>>>
>>>>>>>> Last round of feedback and then we'll send it out!
>>>>>>>>
>>>>>>>> http://datameet.org/wiki/odclettertoecidraft
>>>>>>>>
>>>>>>>> Nisha
>>>>>>>> -- 
>>>>>>>> Nisha Thompson
>>>>>>>> DataMeet.org
>>>>>>>> ni...@datameet.org
>>>>>>>> skype: nishaqt
>>>>>>>> mobile: 962-061-2245
>>>>>>>>  
>>>>>>>> -- 
>>>>>>>> Datameet is a community of Data Science enthusiasts in India. Know 
>>>>>>>> more about us by visiting http://datameet.org
>>>>>>>> --- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "datameet" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>> send an email to datameet+u...@googlegroups.com.
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Nisha Thompson
>>>>>> DataMeet.org
>>>>>> ni...@datameet.org
>>>>>> skype: nishaqt
>>>>>> mobile: 962-061-2245
>>>>>
>>>>
>>>
>>
>
>

