[Tutor] Parse files and create sqlite3 db

2014-12-03 Thread jarod...@libero.it
Dear all
I attach the file I want to parse, from which I want to create a SQLite database.
The problem is that I'm not able to create it correctly because there are some
logic problems in my script. I have this code:


import sqlite
import sqlite3
import os

# Creates or opens a file called mydb with a SQLite3 DB
db = sqlite3.connect('Fastqcd_completo.db')
cursor = db.cursor()
create_table_sql = """
create table fastqc_summary (
fileid varchar,
module varchar,
status varchar,
total int,
duplicate varchar
);"""
cursor.execute(create_table_sql)
create_table_sql2="""
create table fastqc_details (
id serial primary key,
fileid varchar,
module varchar,
col1 varchar,
ocols varchar
);
"""
cursor.execute(create_table_sql2)
db.commit()

for root, dirs, files in os.walk("/home/mauro/Desktop/LAVORO_CRO/2014/Statitica_RNAseqalign/FASTQC_completo/fastqcdecembre/"): # walk a r
    for name in files:
        if (name == "fastqc_data.txt"):
            fileid = name # use string slicing here if you only want part of the
            with open(os.path.join(root,name),"r") as p: # automatically close the file when done
                for i in p:
                    line =i.strip()
                    if "Filename" in line:
                        fileid = line.split()[1]
                    if "Total Sequence" in line:
                        total = line.split()[2]
                    if "Total Duplicate" in line:
                        dup = line.split()[3]
                    if (line[:2] == ">>" and line[:12] != ">>END_MODULE"):
                        module = line[2:-5] # grab module name
                        status = line[-4:] # and overall status pass/warn/fail
                        sql = "insert into fastqc_summary(fileid,module,status,total,duplicate) values(?,?,?,?,?);"
                        data = (fileid,module,status,total,dup)
                        cursor.execute(sql,data)
                    elif (line[:2] != ">>" and line[:2] != "##"): # grab details under each module
                        cols = line.split("\t")
                        col1 = cols[0]
                        ocols = "|".join(cols[1:])
                        sql = "insert into fastqc_details(fileid,module,col1,ocols) values(?,?,?,?);"
                        data = (fileid,module,col1,ocols)
                        cursor.execute(sql,data)
db.commit()


So the problem is how to extract only some parts of the files. The problem
points are marked in red. The error says that Filename is not defined. From the
attached file I want to take this part and use it to create the database:
##FastQC            0.10.1
>>Basic Statistics  pass
#Measure            Value
Filename            R05_CTTGTA_L004_R1_001.fastq.gz
File type           Conventional base calls
Encoding            Sanger / Illumina 1.9
Total Sequences     27868496
Filtered Sequences  0
Sequence length     50
%GC                 50
>>END_MODULE
How can I resolve this problem? Do I need to use the row numbers?
bw,



##FastQC            0.10.1
>>Basic Statistics  pass
#Measure            Value
Filename            R05_CTTGTA_L004_R1_001.fastq.gz
File type           Conventional base calls
Encoding            Sanger / Illumina 1.9
Total Sequences     27868496
Filtered Sequences  0
Sequence length     50
%GC                 50
>>END_MODULE
>>Per base sequence quality  pass
#Base  Mean                Median  Lower Quartile  Upper Quartile  10th Percentile  90th Percentile
1      32.90529076273079   34.0    31.0            34.0            31.0             34.0
2      33.100253167591106  34.0    33.0            34.0            31.0             34.0
3      33.219579520904176  34.0    34.0            34.0            31.0             34.0
4      36.53049856009452   37.0    37.0            37.0            35.0             37.0
5      36.441143289540996  37.0    37.0            37.0            35.0             37.0
6      36.400846174117184  37.0    37.0            37.0            35.0             37.0
7      36.38746253116781   37.0    37.0            37.0            35.0             37.0
8      36.36443940139432   37.0    37.0            37.0            35.0             37.0
9      38.22529985112939   39.0    39.0            39.0            37.0             39.0
10     38.215059327205886  39.0    39.0            39.0            37.0             39.0
11     38.252484848841505  39.0    39.0            39.0            37.0             39.0
12     38.22672156401982   39.0    39.0            39.0            37.0             39.0
13     38.164828234720666  39.0    39.0            39.0            37.0             39.0
14     39.76166327741547   41.0    40.0            41.0            38.0             41.0
15     39.765876098947     41.0    40.0            41.0            38.0             41.0
16     39.74199730764086   41.0    40.0            41.0            37.0             41.0
17     39.07676352538006   41.0    40.0            41.0            36.0             41.0
18     39.42998348385934   41.0    39.0            41.0            36.0             41.0
19     39.57992178695255   41.0    40.0            41.0            37.0             41.0
20     39.438335638923604  41.0    40.0            41.0            37.0             41.0
21     39.54068827395637   41.0    40.0            41.0            37.0             41.0
22     39.464387565084245  41.0    40.0            41.0            37.0             41.0
23     39.227799196626904

Re: [Tutor] Parse files and create sqlite3 db

2014-12-03 Thread Alan Gauld

On 03/12/14 09:36, jarod...@libero.it wrote:

> Dear all
> I attach the file I want to parse, from which I want to create a SQLite database.
> The problem is that I'm not able to create it correctly because there are
> some logic problems in my script. I have this code:


Do you get an error message? If so please send a cut n paste of the full 
error text.



> import sqlite
> import sqlite3

You only need the second of these imports.

> import os
>
> # Creates or opens a file called mydb with a SQLite3 DB
> db = sqlite3.connect('Fastqcd_completo.db')
> cursor = db.cursor()
> create_table_sql = """
> create table fastqc_summary (
> fileid varchar,
> module varchar,
> status varchar,
> total int,
> duplicate varchar
> );"""
> cursor.execute(create_table_sql)

This looks fine, except it's normal to have a unique key somewhere.
But if you are confident that you will have uniqueness in your
data you don't strictly need it.


> create_table_sql2="""
> create table fastqc_details (
> id serial primary key,


But this is odd. I can't find anything about a serial keyword for 
SQLite. The normal format of this would be


id INTEGER PRIMARY KEY

which makes id an auto-incrementing unique value.
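
A minimal sketch of that behaviour, assuming an in-memory database (so nothing is written to disk) and a cut-down, purely illustrative version of the fastqc_details table. No id is supplied in the inserts; SQLite fills it in:

```python
import sqlite3

# In-memory database: an assumption made so the sketch leaves no file behind.
db = sqlite3.connect(":memory:")
cursor = db.cursor()

# Cut-down, illustrative version of the fastqc_details table.
cursor.execute("""
create table fastqc_details (
    id INTEGER PRIMARY KEY,
    fileid varchar,
    module varchar
)""")

# The inserts name only fileid and module; SQLite assigns each row its id.
rows = [
    ("R05_CTTGTA_L004_R1_001.fastq.gz", "Basic Statistics"),
    ("R05_CTTGTA_L004_R1_001.fastq.gz", "Per base sequence quality"),
]
cursor.executemany("insert into fastqc_details(fileid, module) values (?, ?)", rows)
db.commit()

ids = [row[0] for row in cursor.execute("select id from fastqc_details order by id")]
print(ids)
```

With a column declared as "serial primary key" the same inserts would leave id empty, because SQLite does not treat serial as an auto-incrementing type.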


> fileid varchar,
> module varchar,
> col1 varchar,
> ocols varchar
> );
> """
> cursor.execute(create_table_sql2)
> db.commit()


The other potential issue is that you are creating these tables each
time you run the script. But if you run the script a second time the
tables will already exist and the CREATEs will fail. You should either
drop the tables at the top of the script or use the


CREATE TABLE IF NOT EXISTS table ...

format of create. Then if the table already exists your data will
be appended. (If you want it overwritten use the drop table technique.)
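
Both options can be sketched in one helper. This is only an illustration: the function name open_db and its overwrite flag are made up here, and only the fastqc_summary table is shown.

```python
import sqlite3

def open_db(path, overwrite=False):
    """Open the database at `path` and make sure fastqc_summary exists.

    overwrite=True drops any old table first (the "drop table" technique);
    otherwise an existing table and its rows are left in place.
    """
    db = sqlite3.connect(path)
    cursor = db.cursor()
    if overwrite:
        cursor.execute("DROP TABLE IF EXISTS fastqc_summary")
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS fastqc_summary (
            fileid varchar,
            module varchar,
            status varchar,
            total int,
            duplicate varchar
        )""")
    db.commit()
    return db
```

Calling open_db('Fastqcd_completo.db') twice in a row then succeeds both times, and rows inserted on the first run are still there on the second.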


> for root, dirs, files in os.walk("/home/mauro/Desktop/LAVORO_CRO/2014/Statitica_RNAseqalign/FASTQC_completo/fastqcdecembre/"): # walk a r
>     for name in files:
>         if (name == "fastqc_data.txt"):


You are searching for a specific name. Could there be multiple such
files or are you just using walk to find one? If the latter, it would be
better to exit the os.walk loop once you find it and then process the
file. You need to store the root and filename first, of course.
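
If there is only one file to find, the early exit might look like the sketch below. The helper name find_first is hypothetical; returning from inside the loop is what stops the walk.

```python
import os

def find_first(top, wanted="fastqc_data.txt"):
    """Return the full path of the first file named `wanted` under `top`, or None."""
    for root, dirs, files in os.walk(top):
        if wanted in files:
            # Returning here ends the walk as soon as the file is found.
            return os.path.join(root, wanted)
    return None
```

The returned path can then be opened and parsed outside the loop. If several fastqc_data.txt files are expected, one per sample, the original loop over every match is the right shape instead.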



>             fileid = name # use string slicing here if you only want part of the
>             with open(os.path.join(root,name),"r") as p: # automatically close the file when done
>                 for i in p:
>                     line =i.strip()
>                     if "Filename" in line:
>                         fileid = line.split()[1]
>                     if "Total Sequence" in line:
>                         total = line.split()[2]
>                     if "Total Duplicate" in line:
>                         dup = line.split()[3]
>                     if (line[:2] == ">>" and line[:12] != ">>END_MODULE"):
>                         module = line[2:-5] # grab module name
>                         status = line[-4:] # and overall status pass/warn/fail
>                         sql = "insert into fastqc_summary(fileid,module,status,total,duplicate) values(?,?,?,?,?);"
>                         data = (fileid,module,status,total,dup)
>                         cursor.execute(sql,data)
>                     elif (line[:2] != ">>" and line[:2] != "##"): # grab details under each module
>                         cols = line.split("\t")
>                         col1 = cols[0]
>                         ocols = "|".join(cols[1:])
>                         sql = "insert into fastqc_details(fileid,module,col1,ocols) values(?,?,?,?);"
>                         data = (fileid,module,col1,ocols)
>                         cursor.execute(sql,data)
> db.commit()
>
> So the problem is how to extract only some parts of the files. The problem
> points are marked in red. The error says that Filename is not defined.


What says that? It's true that you don't have a variable Filename 
defined. But I don't see Filename in your code anywhere either. So I 
don't understand the error message. Can you post it in full please?



> From the attached file I want to take this part and use it to create the database:
> ##FastQC            0.10.1
> >>Basic Statistics  pass
> #Measure            Value
> Filename            R05_CTTGTA_L004_R1_001.fastq.gz
> File type           Conventional base calls
> Encoding            Sanger / Illumina 1.9
> Total Sequences     27868496
> Filtered Sequences  0
> Sequence length     50
> %GC                 50
> >>END_MODULE



--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] Does the user need to install Python, when we deploy our c++ products using python?

2014-12-03 Thread gordon zhang
I downloaded python 3.4.2 for c++ and created a vc++ project using python,
but I have no idea what python dlls and other stuff are needed to deploy the
products.

I know that putting Python34.dll and Python.dll in the executable's folder
is not enough. What else do we need to put in that folder?
(The application does not run.)

If anyone knows please let me know.

Thanks, Gordon


Re: [Tutor] Does the user need to install Python, when we deploy our c++ products using python?

2014-12-03 Thread Alan Gauld

On 02/12/14 20:28, gordon zhang wrote:


> I downloaded python 3.4.2 for c++ and created a vc++ project using
> python, but I have no idea what python dlls and other stuff are needed to
> deploy the products.


Python is an interpreter, so you will need the full Python interpreter
and all the module files that you use. Some of these are DLLs; others are
Python scripts.



> I know that putting Python34.dll and Python.dll in the executable's
> folder is not enough. What else do we need to put in that folder?
> (The application does not run.)


Using Python just to build an installer script where the users don't
already have Python installed sounds like a bad idea to me, especially
on Windows (which, given the mention of VC++, seems to be the platform).
There are plenty of installer tools for Windows that would be more
suitable, and some of them are even free.


If Python was already installed on your target PCs it would be a 
reasonable decision but, if not, you effectively need two installers - 
one for Python and another for your app.


There are tools such as freeze or py2exe that can bundle up a Python 
script as a single exe file but that means installing the interpreter as 
part of the exe. You could of course delete the python exe bundle after 
the install completed but it still seems overkill for an install script.



--
Alan G


Re: [Tutor] Does the user need to install Python, when we deploy our c++ products using python?

2014-12-03 Thread Steven D'Aprano
On Tue, Dec 02, 2014 at 01:28:36PM -0700, gordon zhang wrote:
> I downloaded python 3.4.2 for c++ and created a vc++ project using python,
> but I have no idea what python dlls and other stuff are needed to deploy the
> products.
>
> I know that putting Python34.dll and Python.dll in the executable's folder
> is not enough. What else do we need to put in that folder?
> (The application does not run.)

I'm afraid I have very little idea what "python 3.4.2 for c++" is, or
what "a vc++ project using python" means. Do you mean Microsoft Visual
C++?

Where did you download this from? How do you use it in C++? C++ and 
Python are two very different languages. Normally people just use Python 
alone, or C++ alone.

This mailing list is for beginners wanting to learn Python, and we may 
not have the experience needed to answer advanced questions like 
building C++ applications with embedded Python. For that, I recommend 
you ask on the comp.lang.python newsgroup, or on the forums of whatever 
embedded python software you are using.


-- 
Steven


Re: [Tutor] A small project

2014-12-03 Thread Peter Otten
Santosh Kumar wrote:

> All,
>
> Any suggestion to my question?
>
> On Thu, Nov 27, 2014 at 12:43 PM, Santosh Kumar rhce@gmail.com
> wrote:
>
>> Hi All,
>>
>> I am planning to start a small project, so I need some suggestions on
>> how to go about it.
>>
>> 1) It basically takes some inputs like (name, age, course, joining
>> date, remarks) and so on.
>> 2) I want a light front end for it.

If you use a web browser you may be able to evade the deployment hassles.
Here's a simple example:

http://bottlepy.org/docs/dev/tutorial_app.html

>> 3) I should be able to query for a particular person on a particular date
>> or a joining date.
>> 4) The app should be easily deployable on both Linux and Windows
>> machines (something like a .exe on Windows).
>>
>> So this is my requirement, as stated in the above four points. Now I need
>> suggestions on how to go ahead with these.
>>
>> For 2) should I go for tkinter, or do we have something even lighter?
>> For 3) to achieve this should I just go for a database? If yes, I need
>> something which can be easily moved across.
>> For 4) if I have to create an executable, how do I make sure all the above
>> requirements can be packaged into an application or a piece of software?
>>
>> Thanks in advance.
>> --
>> D. Santosh Kumar




Re: [Tutor] Parse files and create sqlite3 db (Alan Gauld)

2014-12-03 Thread jarod...@libero.it
Thanks so much, here you have the error:
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-16-64f0293cca64> in <module>()
     43                         status = line[-4:] # and overall status pass/warn/fail
     44                         sql = "insert into fastqc_summary(fileid,module,status,total,duplicate) values(?,?,?,?,?);"
---> 45                         data = (fileid,module,status,total,dup)
     46                         cursor.execute(sql,data)
     47                     elif (line[:2] != ">>" and line[:2] != "##"): # grab details under each module

NameError: name 'total' is not defined
The problem is that I need to write a row only if those names and values exist.
In the original file I have these rows each time:
##FastQC            0.10.1
>>Basic Statistics  pass
#Measure            Value
Filename            R05_CTTGTA_L004_R1_001.fastq.gz
File type           Conventional base calls
Encoding            Sanger / Illumina 1.9
Total Sequences     27868496
Filtered Sequences  0
Sequence length     50
%GC                 50
>>END_MODULE
So they need to be defined. So now do I need to check: if Total, and if Filename
and Total, then run the script?





Re: [Tutor] Parse files and create sqlite3 db (Alan Gauld)

2014-12-03 Thread Dave Angel

On 12/03/2014 08:07 AM, jarod...@libero.it wrote:

> Thanks so much, here you have the error:
> ---------------------------------------------------------------------------
> NameError                                 Traceback (most recent call last)
> <ipython-input-16-64f0293cca64> in <module>()
>      43                         status = line[-4:] # and overall status pass/warn/fail
>      44                         sql = "insert into fastqc_summary(fileid,module,status,total,duplicate) values(?,?,?,?,?);"
> ---> 45                         data = (fileid,module,status,total,dup)
>      46                         cursor.execute(sql,data)
>      47                     elif (line[:2] != ">>" and line[:2] != "##"): # grab details under each module
>
> NameError: name 'total' is not defined


That's because you defined it only inside an if statement.  So if that 
condition is false, total is NOT a variable.  Perhaps you just want to 
give it a default value. I'd tend to do that in an else clause:


    if "Total Sequence" in line:
        total = line.split()[2]
    else:
        total = ""

Similarly for fileid and dup.

Now, you can test those values, or use their defaults directly, in 
whatever other place you need.
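
A sketch of the default-value idea, under some assumptions: the lines of fastqc_data.txt are tab-separated, module headers start with ">>", and only Filename and Total Sequences are wanted. The function name parse_fastqc is hypothetical. One thing to watch: an else placed inside the per-line loop resets the value on every line that does not match, so this sketch sets the defaults once, before the loop, and only overwrites them when a matching line turns up.

```python
def parse_fastqc(lines):
    """Collect basic statistics and (module, status) pairs from fastqc_data.txt lines."""
    # Defaults are set once, so the names always exist even if a line is missing.
    stats = {"Filename": "", "Total Sequences": ""}
    modules = []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">>") and not line.startswith(">>END_MODULE"):
            # Assumes "name<TAB>status" after the ">>" marker.
            name, _, status = line[2:].rpartition("\t")
            modules.append((name, status))
        elif not line.startswith(("#", ">>")) and "\t" in line:
            key, _, value = line.partition("\t")
            if key in stats:
                stats[key] = value
    return stats, modules


# A few lines modelled on the posted file, with tabs assumed as separators.
sample = [
    "##FastQC\t0.10.1\n",
    ">>Basic Statistics\tpass\n",
    "#Measure\tValue\n",
    "Filename\tR05_CTTGTA_L004_R1_001.fastq.gz\t\n",
    "Total Sequences\t27868496\t\n",
    ">>END_MODULE\n",
    ">>Per base sequence quality\tpass\n",
]
stats, modules = parse_fastqc(sample)
print(stats)
print(modules)
```

Because the ">>Basic Statistics" line comes before the Filename and Total Sequences lines, collecting everything first and inserting the rows after the loop means every row sees the final values.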


While I've got you, can I request a few things to make the forum go smoother:

1) Use text, not html, messages.  You did here, but in your original
message you mentioned something about "red", and that doesn't show up
everywhere.  This is a text forum.


2) Use reply-list or reply-all, or whatever your email program will 
manage.  Whenever you just compose a new message you start a new thread, 
and that just breaks the flow.  Many newsreaders don't handle split 
threads, and those that do still cause pain for the user.


3) Don't use attachments.  Just paste it in with everything else (in a 
text email, of course, as html messes up indentation about 50% of the 
time.)  Many environments don't even see the attachment.


--
DaveA


[Tutor] How to split string into separate lines

2014-12-03 Thread shweta kaushik
Hi,

I need help for doing this task. I know it will be simple but I am not able
to do it.

output_message_packet=
fe01b8412756fe02fe01b9416239fe02fe01ba41ad88fe02fe01bb41e8e7fe02fe01bc4112fbfe02fe01bd415794fe02

I want to split this into separate message packets like this:
fe01b8412756fe02
fe01b9416239fe02
fe01ba41ad88fe02
fe01bb41e8e7fe02
fe01bc4112fbfe02
fe01bd415794fe02

After this I want to split this into bytes:
fe 01 b8 41 00 00 00 00 00 00 00 00 27 56 fe 02

Please help me with code for how to do this.

Thanks in advance.

Regards,
Shweta


[Tutor] Python

2014-12-03 Thread jack jasper
Why do I get SyntaxErrors when I type in an article like "a" or a colon ":"?
In "Enter an integer", the "an" gets listed as a SyntaxError.

x == 0: The colon gets flagged. This is part of an if/else program with an
x = raw_input('Enter an integer:  ') sort of statement above it.

I've had trouble with - with the  being flagged. 

I just don't get it.

Thank you,

Jack Jasper
jackjas...@mac.com


Re: [Tutor] How to split string into separate lines

2014-12-03 Thread Joel Goldstick
On Wed, Dec 3, 2014 at 1:18 PM, shweta kaushik coolshw...@gmail.com wrote:
> Hi,
>
> I need help for doing this task. I know it will be simple but I am not able
> to do it.
>
> output_message_packet=
> fe01b8412756fe02fe01b9416239fe02fe01ba41ad88fe02fe01bb41e8e7fe02fe01bc4112fbfe02fe01bd415794fe02
>
> I want to split this into separate message packets like this:
> fe01b8412756fe02
> fe01b9416239fe02
> fe01ba41ad88fe02
> fe01bb41e8e7fe02
> fe01bc4112fbfe02
> fe01bd415794fe02
>
> After this I want to split this into bytes:
> fe 01 b8 41 00 00 00 00 00 00 00 00 27 56 fe 02
>
> Please help me with code for how to do this.
>
> Thanks in advance.
>
> Regards,
> Shweta


Look up python string methods to start.

Does each line start with fe01?




-- 
Joel Goldstick
http://joelgoldstick.com


Re: [Tutor] Python

2014-12-03 Thread Joel Goldstick
On Wed, Dec 3, 2014 at 4:02 PM, jack jasper jackjas...@me.com wrote:
> Why do I get SyntaxErrors when I type in an article like "a" or a colon ":"?
> In "Enter an integer", the "an" gets listed as a SyntaxError.
>
> x == 0: The colon gets flagged. This is part of an if/else program with an

At the end of a line, a colon belongs only on the header of a compound
statement: def, class, if, elif, else, for, while, try, except, with.

> x = raw_input('Enter an integer:  ') sort of statement above it.
>
> I've had trouble with - with the  being flagged.
>
> I just don't get it.

show your code, give version number (2.x or 3.x).  I'm guessing 2.x
since you are using raw_input
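
Since the failing code was not posted, the following is only a guess at the kind of if/else program described; the function name describe is made up for illustration. The points it shows are that prompt text such as 'Enter an integer: ' must sit inside the quotes of a string, and that each if, elif and else header ends with a colon and stands on its own line:

```python
def describe(text):
    """Classify the text a user typed at an 'Enter an integer: ' prompt."""
    x = int(text)    # raw_input() (2.x) and input() (3.x) return a string
    if x == 0:       # the colon ends the 'if' header line
        return "zero"
    elif x < 0:
        return "negative"
    else:
        return "positive"

print(describe("0"))
```

In Python 2.x it would be called as describe(raw_input('Enter an integer: ')); in 3.x the same call uses input() instead.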

> Thank you,
>
> Jack Jasper
> jackjas...@mac.com



-- 
Joel Goldstick
http://joelgoldstick.com


Re: [Tutor] How to split string into separate lines

2014-12-03 Thread Alan Gauld

On 03/12/14 18:18, shweta kaushik wrote:


> I need help for doing this task. I know it will be simple but I am not
> able to do it.


Your description of the task is not very precise so I'll
make some guesses below.


> output_message_packet= fe01b8412756fe02fe01b9410
>
> I want to split this into separate message packets like this:
> fe01b8412756fe02
> fe01b9416239fe02


It looks like packets are delimited either by an 'fe02' terminator or by a
fixed record length. It's not clear whether the fixed length is accidental or
deliberate, or whether it's the terminator that is coincidentally the
same. I'm guessing it's probably based on the terminator...


To split by 'fe02' just use string split, then add the separator
back on:

sep = 'fe02'
# 'if pkt' skips the empty string that split() leaves after the final separator
packets = [pkt+sep for pkt in data.split(sep) if pkt]


If it is based on length it will look something like:

data = []
while s:
   data.append(s[:length])
   s = s[length:]


> After this I want to split this into bytes:
> fe 01 b8 41 00 00 00 00 00 00 00 00 27 56 fe 02


This is the same split by length problem applied to
each packet with length 2.

In fact if it is split by length in both you could write
a helper function:

def split_by_length(s,length):
    data = []
    while s:
        data.append(s[:length])
        s = s[length:]
    return data

And call it with

packets = split_by_length(inputData, packetLength)
bytes = [split_by_length(p,2) for p in packets]
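
Applied to the string from the question, that might look like the sketch below. The packet length of 16 hex characters is an assumption read off the desired output, not a documented packet size; the helper is written as a list comprehension rather than the while loop, and the result is called byte_pairs so it does not hide the built-in name bytes.

```python
def split_by_length(s, length):
    """Cut s into consecutive chunks of `length` characters."""
    return [s[i:i + length] for i in range(0, len(s), length)]

# The string from the original question.
input_data = ("fe01b8412756fe02fe01b9416239fe02fe01ba41ad88fe02"
              "fe01bb41e8e7fe02fe01bc4112fbfe02fe01bd415794fe02")
packet_length = 16  # assumed: 8 bytes per packet, 2 hex characters each

packets = split_by_length(input_data, packet_length)
byte_pairs = [split_by_length(p, 2) for p in packets]

for pairs in byte_pairs:
    print(" ".join(pairs))
```

Each printed line is one packet with its bytes separated by spaces, starting with "fe 01 b8 41 27 56 fe 02".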


HTH
--
Alan G


Re: [Tutor] Python

2014-12-03 Thread Alan Gauld

On 03/12/14 21:02, jack jasper wrote:

> Why do I get SyntaxErrors when I type in an article like "a" or a colon ":"?
> In "Enter an integer", the "an" gets listed as a SyntaxError.
>
> x == 0: The colon gets flagged. This is part of an if/else program with an
> x = raw_input('Enter an integer:  ') sort of statement above it.



That doesn't help, just post real code that generates the error.
And tell us which Python version and OS you are using to run it.


> I've had trouble with - with the  being flagged.


I have no idea what you mean by that.
When do you ever use - in Python?


--
Alan G