Hey Etienne,

You've correctly enumerated a few ways to transfer data from Cloud SQL to 
BigQuery: 

* export to CSV and load the CSV into BigQuery
* retrieve the data with an app and stream it into the BigQuery API
* export to Datastore and then import to BigQuery  

There are other routes as well, such as starting from a mysqldump of your 
SQL database <https://cloud.google.com/bigquery/docs/loading-data-sql-dml>. 
For the full picture, see the BigQuery "Loading Data" documentation 
<https://cloud.google.com/bigquery/loading-data>.
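
To make the first two options above a bit more concrete, here is a minimal Python sketch. It is an illustration only: the row layout (`sensor_id`, `ts`, `value`) and the helper names are hypothetical, and `stream_rows` assumes the `google-cloud-bigquery` client library is installed and that the target table already exists with a matching schema.

```python
import csv
import io
from datetime import datetime, timezone


def normalise(row):
    """Return a copy of the row with datetimes rendered as ISO 8601 strings."""
    return {
        key: value.isoformat() if isinstance(value, datetime) else value
        for key, value in row.items()
    }


def rows_to_csv(rows, fieldnames):
    """Option 1: serialise rows to CSV text, ready to upload and load."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(normalise(r) for r in rows)
    return buf.getvalue()


def stream_rows(table_id, rows):
    """Option 2: send rows through the BigQuery streaming-insert API."""
    # Imported lazily so the CSV path works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    # insert_rows_json returns a list of per-row errors (empty on success).
    errors = client.insert_rows_json(table_id, [normalise(r) for r in rows])
    if errors:
        raise RuntimeError(f"BigQuery rejected some rows: {errors}")


# Hypothetical time-series rows, as they might come back from Cloud SQL.
sample = [
    {"sensor_id": "s1",
     "ts": datetime(2017, 2, 14, 14, 40, 44, tzinfo=timezone.utc),
     "value": 21.5},
    {"sensor_id": "s1",
     "ts": datetime(2017, 2, 14, 14, 41, 44, tzinfo=timezone.utc),
     "value": 21.7},
]
csv_text = rows_to_csv(sample, ["sensor_id", "ts", "value"])
```

The CSV text would then be uploaded to Cloud Storage and loaded from there, while `stream_rows("project.dataset.table", sample)` (a placeholder table ID) would push the same rows directly from the app. Which one fits better depends on how quickly the data needs to be queryable.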

Let me know if you have any further questions, and I'll be happy to assist.

Cheers,

Nick
Cloud Platform Community Support

On Tuesday, February 14, 2017 at 9:40:44 AM UTC-5, Etienne B. Roesch wrote:
>
> Hi,
>
> Sorry for the repeat, but I am trying to wrap my head around the 
> GAE-osphere and I am getting a bit confused;
>
> I need to store and retrieve/analyse timeseries data, of varying sizes and 
> resolutions; at the moment, the data is received and stored on GAE through 
> to Google Cloud SQL (python). That's not ideal. I foresee I will have to do 
> more analytics than data storage for the sake of storage, and I predict a 
> big throughput of data generally, and have thus been looking at 
> BigQuery/Datalab. I don't seem to find an obvious way to load data in 
> BigQuery from Cloud SQL, and would either have to export the data to Cloud 
> Storage in csv-ish, or directly stream the data from the GAE app to 
> BigQuery (which is currently my preferred option).
> Alternatively, there is the option of passing through to Google Datastore 
> first, which for me might be a more flexible way of preprocessing the data 
> before it enters BigQuery.
>
> Is this the way to do things, or am I missing something?
>
> Thanks!
>
> Etienne
>
>
> On Tuesday, 13 August 2013 13:59:52 UTC+1, Martin Trummer wrote:
>>
>> I'm a newbie to the AppEngine datastore and like to know how to best 
>> design this use case:
>> there may be some time series with a huge amount of data, e.g. terabytes 
>> for one time series.
>> The transactions doc 
>> <https://developers.google.com/appengine/docs/java/datastore/transactions> 
>> says about entity groups:
>>
>>    - *"Every entity belongs to an entity group, a set of one or more 
>>    entities that can be manipulated in a single transaction."*
>>    - *"every entity with a given root entity as an ancestor is in the 
>>    same entity group. All entities in a group are stored in the same 
>>    Datastore node."*
>>
>> So does that mean that all the terabytes of data for the huge 
>> time series would end up *on one computer* somewhere in the AppEngine 
>> network?
>> If so: 
>>
>>    - that's not a good idea, right?
>>    - how can I avoid it? Should I split the data into sections (e.g. per 
>>    month) where each section has its own kind/entity group?
>>    
>>
