On Mon, Nov 16, 2009 at 09:41:47AM +0000, Syed Haider wrote:
> Hi Ariel,
> 
> I am pretty confident that it's the size of the SNP dataset that makes 
> large queries take a long time to run. The safest approach with the SNP 
> dataset is to use the web interface and ask for results via the Email 
> option. That would work.
> 
> Best,
> Syed

I wouldn't normally hijack a thread, but the original poster's question 
here is so similar to mine from yesterday....

I was also having trouble pulling a SNP dataset, because I was filtering 
on human-mouse homologs, which meant putting about 10K gene IDs in the 
ensembl_gene filter box. This gave timeouts or URL-length errors even in 
email mode.
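
One middle ground between a single 10K-ID request and one request per 
gene is to split the ID list into fixed-size batches. A minimal sketch 
in Python follows. The batch size of 200 is an arbitrary assumption, not 
a documented BioMart limit; the gene IDs are placeholders; and the 
comma-joined string simply mirrors how a list of IDs would be supplied 
to the filter.

```python
def batches(ids, size=200):
    """Yield successive lists of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Placeholder IDs for illustration only; real Ensembl gene IDs go here.
gene_ids = ["GENE%05d" % i for i in range(10000)]

# One comma-separated filter value per batch, ready to submit one at a time.
filter_values = [",".join(chunk) for chunk in batches(gene_ids)]
```

Whether a given batch size avoids the timeout or URL-length error would 
have to be found by trial.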

Since I only wanted counts of SNPs per gene anyway, I wrote a script to 
pull the counts for each individual gene, and that worked fine for a 
while. But then I got a socket error at about the three thousandth query, 
and kept getting it when I repeated the query -- even from other machines. 
Moreover, the machine I'd been using originally stopped being able to pull 
up BioMart at all, even in a browser; it got server-closed-connection 
errors.
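
A per-gene loop of this kind can be made gentler on the server by 
pausing between queries and backing off after a connection error. The 
sketch below is only an illustration, not the script described above: 
`fetch_count` is a hypothetical stand-in for whatever function actually 
queries BioMart for one gene, and the pause, retry count, and delays are 
guesses rather than known server limits.

```python
import time


def fetch_with_retry(fetch_count, gene_id, retries=4, base_delay=1.0,
                     sleep=time.sleep):
    """Call fetch_count(gene_id), retrying on OSError with exponential backoff.

    Socket and urllib connection errors are subclasses of OSError in Python 3.
    """
    for attempt in range(retries + 1):
        try:
            return fetch_count(gene_id)
        except OSError:
            if attempt == retries:
                raise  # give up after the last allowed retry
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...


def count_all(fetch_count, gene_ids, pause=0.5, sleep=time.sleep):
    """Return {gene_id: count}, pausing after every query."""
    counts = {}
    for gene_id in gene_ids:
        counts[gene_id] = fetch_with_retry(fetch_count, gene_id, sleep=sleep)
        sleep(pause)  # fixed gap so queries are not sent back to back
    return counts
```

Called as `count_all(my_query_function, gene_ids)`, this returns a 
dictionary of counts; it would not help if the server blocks a client 
outright, only if the errors are transient.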

The next day, I tried again and everything was suddenly fine; the rest of 
my queries ran without incident.

So my question is: did I hit some sort of query-count or bandwidth 
limit, or was it just an odd session bug? 

It's not just an idle question, because I might need to do some more runs 
of this type. Also, if there's an obviously better way of getting this 
sort of data (counts of SNPs per gene), let me know. I may very well be 
missing something. Thanks!

=-=-> Jennifer Drummond // [email protected]
