[Tutor] Parse files and create sqlite3 db
Dear all,

I attach the file I want to parse, and I want to create a SQLite database from it. The problem is that I am not able to create it correctly because there are some logic problems in my script. I have this code:

    import sqlite
    import sqlite3
    import os

    # Creates or opens a file called mydb with a SQLite3 DB
    db = sqlite3.connect('Fastqcd_completo.db')
    cursor = db.cursor()
    create_table_sql = """
    create table fastqc_summary (
    fileid varchar,
    module varchar,
    status varchar,
    total int,
    duplicate varchar
    );"""
    cursor.execute(create_table_sql)
    create_table_sql2 = """
    create table fastqc_details (
    id serial primary key,
    fileid varchar,
    module varchar,
    col1 varchar,
    ocols varchar
    );"""
    cursor.execute(create_table_sql2)
    db.commit()

    for root, dirs, files in os.walk("/home/mauro/Desktop/LAVORO_CRO/2014/Statitica_RNAseqalign/FASTQC_completo/fastqcdecembre/"): # walk a r
        for name in files:
            if (name == "fastqc_data.txt"):
                fileid = name # use string slicing here if you only want part of the
                with open(os.path.join(root,name),"r") as p: # automatically close the file when done
                    for i in p:
                        line = i.strip()
                        if "Filename" in line:
                            fileid = line.split()[1]
                        if "Total Sequence" in line:
                            total = line.split()[2]
                        if "Total Duplicate" in line:
                            dup = line.split()[3]
                        if (line[:2] == ">>" and line[:12] != ">>END_MODULE"):
                            module = line[2:-5] # grab module name
                            status = line[-4:] # and overall status pass/warn/fail
                            sql = "insert into fastqc_summary(fileid,module,status,total,duplicate) values(?,?,?,?,?);"
                            data = (fileid,module,status,total,dup)
                            cursor.execute(sql,data)
                        elif (line[:2] != ">>" and line[:2] != "##"): # grab details under each module
                            cols = line.split("\t")
                            col1 = cols[0]
                            ocols = "|".join(cols[1:])
                            sql = "insert into fastqc_details(fileid,module,col1,ocols) values(?,?,?,?);"
                            data = (fileid,module,col1,ocols)
                            cursor.execute(sql,data)
    db.commit()

So the problem is how to extract only some parts of the files. The problem points are marked in red.
It says that Filename is not defined. From the attached file I want to take this part and use it to create the database:

    ##FastQC             0.10.1
    >>Basic Statistics   pass
    #Measure             Value
    Filename             R05_CTTGTA_L004_R1_001.fastq.gz
    File type            Conventional base calls
    Encoding             Sanger / Illumina 1.9
    Total Sequences      27868496
    Filtered Sequences   0
    Sequence length      50
    %GC                  50
    >>END_MODULE

How can I resolve this problem? Do I need to use the row numbers?

bw,

    ##FastQC             0.10.1
    >>Basic Statistics   pass
    #Measure             Value
    Filename             R05_CTTGTA_L004_R1_001.fastq.gz
    File type            Conventional base calls
    Encoding             Sanger / Illumina 1.9
    Total Sequences      27868496
    Filtered Sequences   0
    Sequence length      50
    %GC                  50
    >>END_MODULE
    >>Per base sequence quality   pass
    #Base  Mean                Median  Lower Quartile  Upper Quartile  10th Percentile  90th Percentile
    1      32.90529076273079   34.0    31.0            34.0            31.0             34.0
    2      33.100253167591106  34.0    33.0            34.0            31.0             34.0
    3      33.219579520904176  34.0    34.0            34.0            31.0             34.0
    4      36.53049856009452   37.0    37.0            37.0            35.0             37.0
    5      36.441143289540996  37.0    37.0            37.0            35.0             37.0
    6      36.400846174117184  37.0    37.0            37.0            35.0             37.0
    7      36.38746253116781   37.0    37.0            37.0            35.0             37.0
    8      36.36443940139432   37.0    37.0            37.0            35.0             37.0
    9      38.22529985112939   39.0    39.0            39.0            37.0             39.0
    10     38.215059327205886  39.0    39.0            39.0            37.0             39.0
    11     38.252484848841505  39.0    39.0            39.0            37.0             39.0
    12     38.22672156401982   39.0    39.0            39.0            37.0             39.0
    13     38.164828234720666  39.0    39.0            39.0            37.0             39.0
    14     39.76166327741547   41.0    40.0            41.0            38.0             41.0
    15     39.765876098947     41.0    40.0            41.0            38.0             41.0
    16     39.74199730764086   41.0    40.0            41.0            37.0             41.0
    17     39.07676352538006   41.0    40.0            41.0            36.0             41.0
    18     39.42998348385934   41.0    39.0            41.0            36.0             41.0
    19     39.57992178695255   41.0    40.0            41.0            37.0             41.0
    20     39.438335638923604  41.0    40.0            41.0            37.0             41.0
    21     39.54068827395637   41.0    40.0            41.0            37.0             41.0
    22     39.464387565084245  41.0    40.0            41.0            37.0             41.0
    23     39.227799196626904
Re: [Tutor] Parse files and create sqlite3 db
On 03/12/14 09:36, jarod...@libero.it wrote:
> Dear all,
> I attach the file I want to parse, and I want to create a SQLite database from it.
> The problem is that I am not able to create it correctly because there are
> some logic problems in my script. I have this code:

Do you get an error message? If so, please send a cut-and-paste of the full error text.

> import sqlite
> import sqlite3

You only need the second of these imports.

> import os
>
> # Creates or opens a file called mydb with a SQLite3 DB
> db = sqlite3.connect('Fastqcd_completo.db')
> cursor = db.cursor()
> create_table_sql = """
> create table fastqc_summary (
> fileid varchar,
> module varchar,
> status varchar,
> total int,
> duplicate varchar
> );"""
> cursor.execute(create_table_sql)

This looks fine, except that it's normal to have a unique key somewhere. But if you are confident that your data will be unique you don't strictly need one.

> create_table_sql2 = """
> create table fastqc_details (
> id serial primary key,

But this is odd. I can't find anything about a serial keyword for SQLite. The normal format of this would be

    id INTEGER PRIMARY KEY

which makes id an auto-incrementing unique value.

> fileid varchar,
> module varchar,
> col1 varchar,
> ocols varchar
> );"""
> cursor.execute(create_table_sql2)
> db.commit()

The other potential issue is that you are creating these tables each time you run the script. If you run the script a second time the tables will already exist and the CREATEs will fail. You should either drop the tables at the top of the script or use the

    CREATE TABLE IF NOT EXISTS table ...

format of create. Then, if the table already exists, your data will be appended. (If you want it overwritten, use the drop table technique.)

> for root, dirs, files in os.walk("/home/mauro/Desktop/LAVORO_CRO/2014/Statitica_RNAseqalign/FASTQC_completo/fastqcdecembre/"): # walk a r
>     for name in files:
>         if (name == "fastqc_data.txt"):

You are searching for a specific name. Could there be multiple such files, or are you just using walk to find one?
If the latter, it would be better to exit the os.walk loop once you find it and then process the file. You need to store the root and filename first, of course.

>             fileid = name # use string slicing here if you only want part of the
>             with open(os.path.join(root,name),"r") as p: # automatically close the file when done
>                 for i in p:
>                     line = i.strip()
>                     if "Filename" in line:
>                         fileid = line.split()[1]
>                     if "Total Sequence" in line:
>                         total = line.split()[2]
>                     if "Total Duplicate" in line:
>                         dup = line.split()[3]
>                     if (line[:2] == ">>" and line[:12] != ">>END_MODULE"):
>                         module = line[2:-5] # grab module name
>                         status = line[-4:] # and overall status pass/warn/fail
>                         sql = "insert into fastqc_summary(fileid,module,status,total,duplicate) values(?,?,?,?,?);"
>                         data = (fileid,module,status,total,dup)
>                         cursor.execute(sql,data)
>                     elif (line[:2] != ">>" and line[:2] != "##"): # grab details under each module
>                         cols = line.split("\t")
>                         col1 = cols[0]
>                         ocols = "|".join(cols[1:])
>                         sql = "insert into fastqc_details(fileid,module,col1,ocols) values(?,?,?,?);"
>                         data = (fileid,module,col1,ocols)
>                         cursor.execute(sql,data)
> db.commit()
>
> So the problem is how to extract only some parts of the files.
> The problem points are marked in red.
> It says that Filename is not defined

What says that? It's true that you don't have a variable Filename defined. But I don't see Filename in your code anywhere either. So I don't understand the error message. Can you post it in full, please?
> the file attached I want to take this part and use it to create the database:
>
> ##FastQC             0.10.1
> >>Basic Statistics   pass
> #Measure             Value
> Filename             R05_CTTGTA_L004_R1_001.fastq.gz
> File type            Conventional base calls
> Encoding             Sanger / Illumina 1.9
> Total Sequences      27868496
> Filtered Sequences   0
> Sequence length      50
> %GC                  50
> >>END_MODULE

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
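The two schema suggestions in this reply (CREATE TABLE IF NOT EXISTS, and INTEGER PRIMARY KEY instead of serial) can be tried out in isolation. What follows is only a minimal sketch, not the poster's final script: the create_tables helper is a hypothetical name, and an in-memory database is assumed so that nothing is written to disk.

```python
import sqlite3


def create_tables(db):
    """Create both tables; safe to call again on an existing database."""
    cursor = db.cursor()
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS fastqc_summary (
            fileid varchar,
            module varchar,
            status varchar,
            total int,
            duplicate varchar
        )""")
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS fastqc_details (
            id INTEGER PRIMARY KEY,  -- SQLite assigns this automatically
            fileid varchar,
            module varchar,
            col1 varchar,
            ocols varchar
        )""")
    db.commit()


# ":memory:" is an assumption for the demo; the thread uses a file on disk.
db = sqlite3.connect(":memory:")
create_tables(db)
create_tables(db)  # a second call does not fail, unlike a plain CREATE TABLE

cur = db.cursor()
sql = "insert into fastqc_details(fileid,module,col1,ocols) values(?,?,?,?)"
cur.execute(sql, ("file1", "Basic Statistics", "Encoding", "Sanger / Illumina 1.9"))
cur.execute(sql, ("file1", "Basic Statistics", "Sequence length", "50"))
db.commit()

# The id column was never supplied, yet each row received one.
ids = [row[0] for row in cur.execute("select id from fastqc_details order by id")]
print(ids)
```

With IF NOT EXISTS, rerunning the script appends rows to the existing tables; dropping the tables first remains the option when a clean start is wanted.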
[Tutor] Does the user need to install Python, when we deploy our c++ products using python?
I downloaded Python 3.4.2 for C++ and created a VC++ project using Python, but I have no idea which Python DLLs and other files are needed to deploy the product. I know that putting Python34.dll and Python.dll in the executable's folder is not enough (the application does not run). What else do we need to put in the executable's folder? If anyone knows, please let me know.

Thanks,
Gordon
Re: [Tutor] Does the user need to install Python, when we deploy our c++ products using python?
On 02/12/14 20:28, gordon zhang wrote:
> I downloaded python 3.4.2 for c++ and create a vc++ project using python,
> but I have no idea what python dlls and other stuff needed to deploy the
> products.

Python is an interpreter, so you will need the full Python interpreter and all the module files that you use. Some of these are DLLs, others are Python scripts.

> I know if we put Python34.dll and Python.dll in the folder of executable, it
> is not enough. What else do we need to put in the folder of executable?(the
> application does not run)

Using Python just to build an installer script where the users don't already have Python installed sounds like a bad idea to me, especially on Windows (which, given the mention of VC++, seems to be the platform). There are plenty of installer tools for Windows that would be more suitable, and some of them are even free.

If Python was already installed on your target PCs it would be a reasonable decision but, if not, you effectively need two installers: one for Python and another for your app.

There are tools such as freeze or py2exe that can bundle up a Python script as a single exe file, but that means installing the interpreter as part of the exe. You could of course delete the Python exe bundle after the install completed, but it still seems overkill for an install script.

--
Alan G
Re: [Tutor] Does the user need to install Python, when we deploy our c++ products using python?
On Tue, Dec 02, 2014 at 01:28:36PM -0700, gordon zhang wrote:
> I downloaded python 3.4.2 for c++ and create a vc++ project using python,
> but I have no idea what python dlls and other stuff needed to deploy the
> products.
>
> I know if we put Python34.dll and Python.dll in the folder of executable, it
> is not enough. What else do we need to put in the folder of executable?(the
> application does not run)

I'm afraid I have very little idea what "python 3.4.2 for c++" is, or what "a vc++ project using python" means. Do you mean Microsoft Visual C++? Where did you download this from? How do you use it in C++?

C++ and Python are two very different languages. Normally people just use Python alone, or C++ alone. This mailing list is for beginners wanting to learn Python, and we may not have the experience needed to answer advanced questions like building C++ applications with embedded Python. For that, I recommend you ask on the comp.lang.python newsgroup, or on the forums of whatever embedded Python software you are using.

--
Steven
Re: [Tutor] A small project
Santosh Kumar wrote:

> All,
>
> Any suggestions for my question?
>
> On Thu, Nov 27, 2014 at 12:43 PM, Santosh Kumar rhce@gmail.com wrote:
>
>> Hi All,
>>
>> I am planning to start a small project, so I need some suggestions on how
>> to go about it.
>>
>> 1) It basically takes some inputs like (name, age, course, joining date,
>> remarks) and so on.
>> 2) I want a light front end for it.

If you use a web browser you may be able to evade the deployment hassles. Here's a simple example:

http://bottlepy.org/docs/dev/tutorial_app.html

>> 3) I should be able to query for a particular person on a particular date
>> or a joining date.
>> 4) The app should be easily deployable on both Linux and Windows
>> machines (something like a .exe in Windows).
>>
>> So this is my requirement as stated in the above four points. Now I need
>> suggestions on how to go ahead with these.
>>
>> for 2) should I go for tkinter, or do we have something even lighter?
>> for 3) to achieve this, should I just go for a database? If yes, I need
>> something which can be easily moved across.
>> for 4) if I have to create an executable, how do I make sure all the above
>> requirements can be packaged into an application or a piece of software?
>>
>> Thanks in advance.
>
> --
> D. Santosh Kumar
Re: [Tutor] Parse files and create sqlite3 db (Alan Gauld)
Thanks so much, here is the error:

    ---------------------------------------------------------------------------
    NameError                                 Traceback (most recent call last)
    <ipython-input-16-64f0293cca64> in <module>()
         43 status = line[-4:] # and overall status pass/warn/fail
         44 sql = "insert into fastqc_summary(fileid,module,status,total,duplicate) values(?,?,?,?,?);"
    ---> 45 data = (fileid,module, status,total,dup)
         46 cursor.execute(sql, data)
         47 elif (line[:2] != ">>" and line [:2] != "##"): # grab details under each module

    NameError: name 'total' is not defined

The problem is that I need to write only if those names and values exist. In the original file I have these rows each time:

    ##FastQC             0.10.1
    >>Basic Statistics   pass
    #Measure             Value
    Filename             R05_CTTGTA_L004_R1_001.fastq.gz
    File type            Conventional base calls
    Encoding             Sanger / Illumina 1.9
    Total Sequences      27868496
    Filtered Sequences   0
    Sequence length      50
    %GC                  50
    >>END_MODULE

So they need to be defined. So now do I need to check: if Filename and Total are present, then run the rest of the script?
Re: [Tutor] Parse files and create sqlite3 db (Alan Gauld)
On 12/03/2014 08:07 AM, jarod...@libero.it wrote:
> thanks so much,here you have the error:
>
> ---------------------------------------------------------------------------
> NameError                                 Traceback (most recent call last)
> <ipython-input-16-64f0293cca64> in <module>()
>      43 status = line[-4:] # and overall status pass/warn/fail
>      44 sql = "insert into fastqc_summary(fileid,module,status,total,duplicate) values(?,?,?,?,?);"
> ---> 45 data = (fileid,module, status,total,dup)
>      46 cursor.execute(sql, data)
>      47 elif (line[:2] != ">>" and line [:2] != "##"): # grab details under each module
>
> NameError: name 'total' is not defined

That's because you defined it only inside an if statement. So if that condition is false, total is NOT a variable. Perhaps you just want to give it a default value. I'd tend to do that in an else clause:

    if "Total Sequence" in line:
        total = line.split()[2]
    else:
        total = ""

Similarly for fileid and dup.

Now you can test those values, or use their defaults directly, in whatever other place you need.

While I've got you, can I request a few things to make the forum go more smoothly:

1) Use text, not html, messages. You did here, but in your original message you mentioned something about red, and that doesn't show up everywhere. This is a text forum.

2) Use reply-list or reply-all, or whatever your email program will manage. Whenever you just compose a new message you start a new thread, and that just breaks the flow. Many newsreaders don't handle split threads, and those that do still cause pain for the user.

3) Don't use attachments. Just paste it in with everything else (in a text email, of course, as html messes up indentation about 50% of the time). Many environments don't even see the attachment.

--
DaveA
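One point to weigh when applying the default-value idea: in the original script the test runs once per line inside the read loop, so an else branch there would reset total on every line that does not match. A minimal sketch of a variant, assuming that loop structure, sets the defaults once before the loop instead. The function name parse_basic_stats and the sample lines are made up for illustration; the field tests are the ones from the original script.

```python
def parse_basic_stats(lines):
    """Return (fileid, total, dup), using "" for any field never seen."""
    # Defaults are assigned once, before the loop, so a later
    # non-matching line cannot wipe out a value already found.
    fileid, total, dup = "", "", ""
    for raw in lines:
        line = raw.strip()
        if "Filename" in line:
            fileid = line.split()[1]
        if "Total Sequence" in line:
            total = line.split()[2]
        if "Total Duplicate" in line:
            dup = line.split()[3]
    return fileid, total, dup


# Hypothetical sample, tab-separated like a fastqc_data.txt file.
sample = [
    "Filename\tR05_CTTGTA_L004_R1_001.fastq.gz\n",
    "File type\tConventional base calls\n",
    "Total Sequences\t27868496\n",
]
result = parse_basic_stats(sample)
print(result)
```

Because every name is bound before the loop starts, the later insert statement can no longer raise NameError; a missing field simply arrives in the database as an empty string.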
[Tutor] How to split string into separate lines
Hi,

I need help with this task. I know it will be simple, but I am not able to do it.

output_message_packet=
fe01b8412756fe02fe01b9416239fe02fe01ba41ad88fe02fe01bb41e8e7fe02fe01bc4112fbfe02fe01bd415794fe02

I want to split this into separate message packets like this:

fe01b8412756fe02
fe01b9416239fe02
fe01ba41ad88fe02
fe01bb41e8e7fe02
fe01bc4112fbfe02
fe01bd415794fe02

After this I want to split each packet into bytes:

fe 01 b8 41 00 00 00 00 00 00 00 00 27 56 fe 02

Please help me with code for how to do this.

Thanks in advance.

Regards,
Shweta
[Tutor] Python
Why do I get SyntaxErrors when I type in an article like a or a colon :

Enter an integer
    an gets listed as a SyntaxError

x == 0:
    The colon gets flagged

This is part of an if else program with an

x = raw_input('Enter an integer: ')

sort of statement above it.

I've had trouble with - with the being flagged.

I just don't get it.

Thank you,
Jack Jasper
jackjas...@mac.com
Re: [Tutor] How to split string into separate lines
On Wed, Dec 3, 2014 at 1:18 PM, shweta kaushik coolshw...@gmail.com wrote:
> Hi,
>
> I need help for doing this task. I know it will be simple but I am not able
> to do it.
>
> output_message_packet=
> fe01b8412756fe02fe01b9416239fe02fe01ba41ad88fe02fe01bb41e8e7fe02fe01bc4112fbfe02fe01bd415794fe02
>
> I want to split this into separate message packets like this:
>
> fe01b8412756fe02
> fe01b9416239fe02
> fe01ba41ad88fe02
> fe01bb41e8e7fe02
> fe01bc4112fbfe02
> fe01bd415794fe02
>
> After this I want to split this into bytes:
>
> fe 01 b8 41 00 00 00 00 00 00 00 00 27 56 fe 02
>
> Please help me with code for how to do this.
>
> Thanks in advance.
>
> Regards,
> Shweta

Look up Python string methods to start. Does each line start with fe01?

--
Joel Goldstick
http://joelgoldstick.com
Re: [Tutor] Python
On Wed, Dec 3, 2014 at 4:02 PM, jack jasper jackjas...@me.com wrote:
> Why do I get SyntaxErrors when I type in an article like a or a colon :
>
> Enter an integer
>     an gets listed as a SyntaxError
>
> x == 0:
>     The colon gets flagged
>
> This is part of an if else program with an

A colon only goes on the end of def, if, elif, else, and for statements.

> x = raw_input('Enter an integer: ')
> sort of statement above it.
>
> I've had trouble with - with the being flagged.
>
> I just don't get it.

Show your code, and give the version number (2.x or 3.x). I'm guessing 2.x since you are using raw_input.

> Thank you, Jack Jasper
> jackjas...@mac.com

--
Joel Goldstick
http://joelgoldstick.com
Re: [Tutor] How to split string into separate lines
On 03/12/14 18:18, shweta kaushik wrote:
> I need help for doing this task. I know it will be simple but I am not able
> to do it.

Your description of the task is not very precise, so I'll make some guesses below.

> output_message_packet=
> fe01b8412756fe02fe01b9410
>
> I want to split this into separate message packets like this:
>
> fe01b8412756fe02
> fe01b9416239fe02

It looks like the packets are delimited either by an fe02 terminator or by a fixed record length. It's not clear whether the fixed length is accidental or deliberate, or whether it's the terminator that is coincidentally the same. I'm guessing it's probably based on the terminator...

To split by 'fe02' just use string split, then add the separator back on:

    sep = 'fe02'
    # skip the empty piece that split() leaves after the final separator
    packets = [pkt+sep for pkt in data.split(sep) if pkt]

If it is based on length it will look something like:

    data = []
    while s:
        data.append(s[:length])
        s = s[length:]

> After this I want to split this into bytes:
>
> fe 01 b8 41 00 00 00 00 00 00 00 00 27 56 fe 02

This is the same split-by-length problem applied to each packet with length 2. In fact, if it is split by length in both cases, you could write a helper function:

    def split_by_length(s, length):
        data = []
        while s:
            data.append(s[:length])
            s = s[length:]
        return data

And call it with:

    packets = split_by_length(inputData, packetLength)
    bytes = [split_by_length(p, 2) for p in packets]

HTH

--
Alan G
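For a quick check of the split-by-length approach, here is a minimal, self-contained sketch. It assumes every packet is exactly 16 hex characters (8 bytes), which is inferred only from the sample packets in the question; the data string is the first two of those packets, and byte_groups is used as the result name so that the built-in bytes is not shadowed. The helper uses slicing in a list comprehension rather than a while loop, but produces the same chunks.

```python
def split_by_length(s, length):
    """Cut s into consecutive pieces of `length` characters each."""
    return [s[i:i + length] for i in range(0, len(s), length)]


# First two packets from the question, joined as in output_message_packet.
data = "fe01b8412756fe02fe01b9416239fe02"

PACKET_LENGTH = 16  # assumption: fixed-size packets of 8 bytes (16 hex digits)

packets = split_by_length(data, PACKET_LENGTH)
byte_groups = [split_by_length(p, 2) for p in packets]

for group in byte_groups:
    print(" ".join(group))
```

If the packets turn out to be terminator-delimited with varying lengths, the split on 'fe02' is the approach to use instead, keeping in mind that the same four characters could also occur inside a packet's payload.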
Re: [Tutor] Python
On 03/12/14 21:02, jack jasper wrote:
> Why do I get SyntaxErrors when I type in an article like a or a colon :
>
> Enter an integer
>     an gets listed as a SyntaxError
>
> x == 0:
>     The colon gets flagged
>
> This is part of an if else program with an
> x = raw_input('Enter an integer: ')
> sort of statement above it.

That doesn't help; just post the real code that generates the error. And tell us which Python version and OS you are using to run it.

> I've had trouble with - with the being flagged.

I have no idea what you mean by that. When do you ever use - in Python?

--
Alan G
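Since the failing code was never posted, the following is only a guess at the shape of the program being described, not a diagnosis. It is a minimal sketch of an if/elif/else block in which the prompt text stays inside quotes and each header line starts with its keyword and ends with a colon. The function name describe and the fixed input string are made up so that the example runs without typing anything.

```python
def describe(x):
    """Classify an integer as zero, positive or negative."""
    if x == 0:          # header: keyword, condition, then the colon
        return "zero"
    elif x > 0:
        return "positive"
    else:               # else takes no condition, only the colon
        return "negative"


# Stands in for: text = raw_input('Enter an integer: ')
# The prompt is a quoted string; typed without the quotes, its words
# would be read as Python code.
text = "42"
print(describe(int(text)))
```

A line such as x == 0: on its own, without the leading if, is one way to get the colon flagged as a syntax error.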