Hi John,
I have just started pulling Twitter conversations using Apache Flume, but I
have not started processing the pulled data yet. My answers are below:
1) How large is each JSON document?
Files average from 100 KB to 2 MB. Flume rolls a new file every 1 minute
(which is configurable), so the size depends on the number of events that
occurred during that interval.
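For reference, the one-minute roll is just the HDFS sink's rollInterval in the
Flume agent properties. A minimal sketch of what my setup looks like - the
agent/channel names and the HDFS path here are placeholders, not my actual
config:

```properties
# Hypothetical agent "TwitterAgent" streaming raw tweet events to HDFS
TwitterAgent.sources  = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks    = HDFS

TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
# (consumerKey/consumerSecret/accessToken settings omitted here)

TwitterAgent.channels.MemChannel.type = memory

TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://namenode/user/flume/tweets
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
# Roll a new file every 60 seconds; disable size- and count-based rolling
TwitterAgent.sinks.HDFS.hdfs.rollInterval = 60
TwitterAgent.sinks.HDFS.hdfs.rollSize  = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 0
```

With rollSize and rollCount zeroed out, only the time interval triggers a roll,
which is why file size tracks event volume directly.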
2) Do they tend to be a single JSON doc per file, or multiples per
file?
Multiples per file - the largest file so far (3.2 MB) had about 1,100 JSON docs.
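In case it's useful, this is roughly how I'd count docs per file - a sketch
assuming one JSON document per line (newline-delimited), which is how my files
come out; the function name is just for illustration:

```python
import json


def count_json_docs(path):
    """Count newline-delimited JSON documents in a rolled file.

    Assumes one JSON doc per line; blank or unparseable lines are skipped.
    """
    count = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                json.loads(line)
                count += 1
            except ValueError:
                # Partial/corrupt line (e.g. truncated on roll) - skip it
                continue
    return count
```

If your files pack multiple docs per line, or a doc spans lines, you'd need a
streaming parser instead, but line-at-a-time has been enough for my data so far.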
3) Do the JSON schemas change over time?
No - since it's the standard Twitter API, the schema is stable.
4) Are there interesting public data sets you would recommend for
experiment?
The Twitter API itself.
Thanks,
Lenin
On Tue, Jul 2, 2013 at 9:34 PM, John Lilley wrote:
> I would like to hear your experiences working with large JSON data sets,
> specifically:
>
> 1) How large is each JSON document?
>
> 2) Do they tend to be a single JSON doc per file, or multiples
> per file?
>
> 3) Do the JSON schemas change over time?
>
> 4) Are there interesting public data sets you would recommend
> for experiment?
>
> Thanks
>
> John
>