Hello all,

It was nice to meet you last week!!!

I am trying to write a genomic PCollection, created from BigQuery, to a folder.
Below is the code with its output, so you can run it against any small BQ table
and let me know your thoughts:

rows = [{u'index': u'GSM2313641', u'SNRPCP14': 0},
        {u'index': u'GSM2316666', u'SNRPCP14': 0},
        {u'index': u'GSM2312355', u'SNRPCP14': 0},
        {u'index': u'GSM2312372', u'SNRPCP14': 0}]

rows[1].keys()
# output:  [u'index', u'SNRPCP14']

# you can change `archs4.results_20180308_*` to any other table name with an
# index column
queries2 = rows | beam.Map(lambda x: (
    beam.io.Read(beam.io.BigQuerySource(
        project='orielresearch-188115',
        use_standard_sql=False,
        query=str("SELECT * FROM `archs4.results_20180308_*` where index='%s'"
                  % x["index"]))),
    str('gs://archs4/output/' + x["index"] + '/')))

queries2
# output: a list of (Read transform, output path) pairs; each path is where
# the data from that read should be written

[(<Read(PTransform) label=[Read] at 0x7fa6990fb7d0>,
  'gs://archs4/output/GSM2313641/'),
 (<Read(PTransform) label=[Read] at 0x7fa6990fb950>,
  'gs://archs4/output/GSM2316666/'),
 (<Read(PTransform) label=[Read] at 0x7fa6990fb9d0>,
  'gs://archs4/output/GSM2312355/'),
 (<Read(PTransform) label=[Read] at 0x7fa6990fbb50>,
  'gs://archs4/output/GSM2312372/')]


# this is my challenge
queries2 | 'write to relevant path' >> beam.io.WriteToText("SECOND COLUMN")

Do you have any idea how to sink the data to a text file? I have tried a few
other options and got stuck at the write transform.
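
To make the goal concrete, here is a minimal plain-Python sketch (no Beam) of
the output layout I am after: one folder per index value, with that index's
rows written to a text file inside it. The local temp directory standing in for
gs://archs4/output/, the helper name write_per_index and the file name
part-00000.txt are all assumptions for illustration only, and the hard-coded
rows stand in for the BigQuery results.

```python
import json
import os
import tempfile

# Stand-in for the BigQuery results (same shape as `rows` above).
rows = [
    {"index": "GSM2313641", "SNRPCP14": 0},
    {"index": "GSM2316666", "SNRPCP14": 0},
    {"index": "GSM2312355", "SNRPCP14": 0},
    {"index": "GSM2312372", "SNRPCP14": 0},
]


def write_per_index(rows, output_root):
    """Append each row, as one JSON line, to <output_root>/<index>/part-00000.txt."""
    written = []
    for row in rows:
        folder = os.path.join(output_root, row["index"])
        os.makedirs(folder, exist_ok=True)
        # Hypothetical file name, chosen only for this sketch.
        path = os.path.join(folder, "part-00000.txt")
        with open(path, "a") as handle:
            handle.write(json.dumps(row, sort_keys=True) + "\n")
        written.append(path)
    return written


# A local temp directory stands in for gs://archs4/output/.
output_root = tempfile.mkdtemp()
paths = write_per_index(rows, output_root)
```

What I cannot work out is how to get the same per-index layout out of the
write transform in the pipeline itself.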

Any advice would be much appreciated.

Thanks,
Eila



-- 
Eila
www.orielresearch.org
https://www.meetup.com/Deep-Learning-In-Production/
