[twitter-dev] Re: Read and store Twitter responses
On Mon, Apr 20, 2009 at 5:59 PM, Joseph wrote:
> This may not be the best thing to do in the case of statuses.
> Optimization implies that you have two tables (minimum), one for the
> user info, and one for the tweets. Doing a batch update means that
> you're skipping the step of checking to see if the user is already in
> the database, so for every tweet, you will add the same user again.
> That will slow you down much more than the batch advantage,
> and will create confusion (unless you store all in one table, and
> that's even more burdensome).

There are a couple of ways to deal with this. Given sufficient memory, keep a hash of userIDs in memory and only insert the new ones. If memory consumption is a problem, and assuming that the userID is the primary key in the user table, do an INSERT IGNORE for all of the users. With userID indexed, that will be quite fast.

It won't be that simple if you have foreign key constraints, but I can't imagine referential integrity is critical for this sort of application. My system is far more constrained by things other than the insert speeds.

Nick
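A minimal sketch of the two approaches Nick describes, written in Python with the standard-library sqlite3 module rather than PHP and MySQL so that it runs without a server. The table layout, column names, sample rows, and function names are all assumptions made for illustration. SQLite spells the statement INSERT OR IGNORE; in MySQL it is INSERT IGNORE.

```python
import sqlite3

# Hypothetical sample data: (user_id, screen_name, status_id, text).
SAMPLE_TWEETS = [
    (1, "alice", 101, "first"),
    (2, "bob", 102, "hello"),
    (1, "alice", 103, "second"),  # same user as the first row
]


def make_users_db():
    """Create an in-memory database with a users table keyed on user_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (user_id INTEGER PRIMARY KEY, screen_name TEXT)"
    )
    return conn


def insert_new_users_with_seen_set(conn, tweets, seen):
    """Approach 1: keep known user IDs in memory and insert only new ones."""
    new_rows = []
    for user_id, screen_name, _status_id, _text in tweets:
        if user_id not in seen:
            seen.add(user_id)
            new_rows.append((user_id, screen_name))
    conn.executemany(
        "INSERT INTO users (user_id, screen_name) VALUES (?, ?)", new_rows
    )
    return len(new_rows)


def insert_users_ignoring_duplicates(conn, tweets):
    """Approach 2: send every user and let the primary key skip duplicates."""
    rows = [(user_id, screen_name) for user_id, screen_name, _s, _t in tweets]
    conn.executemany(
        "INSERT OR IGNORE INTO users (user_id, screen_name) VALUES (?, ?)",
        rows,
    )


def count_users(conn):
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The in-memory set avoids sending duplicate rows to the database at all, at the cost of holding every seen ID in memory; the ignore-duplicates form keeps no state in the client and relies on the primary-key index instead.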
[twitter-dev] Re: Read and store Twitter responses
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html

On Mon, Apr 20, 2009 at 19:59, Joseph wrote:
> This may not be the best thing to do in the case of statuses.
> Optimization implies that you have two tables (minimum), one for the
> user info, and one for the tweets. Doing a batch update means that
> you're skipping the step of checking to see if the user is already in
> the database, so for every tweet, you will add the same user again.
> That will slow you down much more than the batch advantage,
> and will create confusion (unless you store all in one table, and
> that's even more burdensome).
>
> Now, does anyone know if there's some obscure version of UPDATE that
> takes parameters to allow me to use UPDATE instead of INSERT (saving
> me from the extra step of checking if the person is already in my
> database)? I'm fairly new to MySQL.
>
> On Apr 20, 4:14 pm, Nick Arnett wrote:
> > On Mon, Apr 20, 2009 at 3:16 PM, Doug Williams wrote:
> > > 3. For each status in the set, perform an SQL insert to save the status.
> >
> > Or, I would hope, create an array of inserts and do a multi-insert, which
> > will be far faster than iterating through a list.
> >
> > http://www.desilva.biz/mysql/insert.html
> >
> > I'll bet you knew that, but I just had to note it because the performance
> > difference is enormous.
> >
> > Nick
> > (not really a PHP guy, but years of (often painfully gained) MySQL
> > performance knowledge)

--
Abraham Williams | http://the.hackerconundrum.com
Hacker | http://abrah.am | http://twitter.com/abraham
Web608 | Community Evangelist | http://web608.org
This email is: [ ] blogable [x] ask first [ ] private.
Sent from Madison, Wisconsin, United States
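The page linked above documents MySQL's INSERT ... ON DUPLICATE KEY UPDATE, which inserts a row or, when the key already exists, updates the existing row in the same statement. A rough sketch of that idea follows, again in Python with sqlite3 so it runs without a server. SQLite's ON CONFLICT ... DO UPDATE clause (it needs SQLite 3.24 or later) stands in for the MySQL syntax, and the table, columns, and function names are assumptions for illustration only.

```python
import sqlite3


def make_upsert_db():
    """Create an in-memory users table keyed on user_id (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "user_id INTEGER PRIMARY KEY, screen_name TEXT, followers INTEGER)"
    )
    return conn


def upsert_user(conn, user_id, screen_name, followers):
    """Insert the user, or refresh the stored fields if the ID already exists.

    The MySQL form described on the linked page would look roughly like:
      INSERT INTO users (user_id, screen_name, followers) VALUES (...)
      ON DUPLICATE KEY UPDATE screen_name = VALUES(screen_name),
                              followers = VALUES(followers)
    """
    conn.execute(
        """
        INSERT INTO users (user_id, screen_name, followers)
        VALUES (?, ?, ?)
        ON CONFLICT(user_id) DO UPDATE SET
            screen_name = excluded.screen_name,
            followers = excluded.followers
        """,
        (user_id, screen_name, followers),
    )


def get_user(conn, user_id):
    return conn.execute(
        "SELECT screen_name, followers FROM users WHERE user_id = ?",
        (user_id,),
    ).fetchone()
```

Calling upsert_user twice with the same user_id leaves one row holding the values from the second call, so no separate "does this user exist?" query is needed.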
[twitter-dev] Re: Read and store Twitter responses
This may not be the best thing to do in the case of statuses. Optimization implies that you have two tables (minimum), one for the user info, and one for the tweets. Doing a batch update means that you're skipping the step of checking to see if the user is already in the database, so for every tweet, you will add the same user again. That will slow you down much more than the batch advantage, and will create confusion (unless you store all in one table, and that's even more burdensome).

Now, does anyone know if there's some obscure version of UPDATE that takes parameters to allow me to use UPDATE instead of INSERT (saving me from the extra step of checking if the person is already in my database)? I'm fairly new to MySQL.

On Apr 20, 4:14 pm, Nick Arnett wrote:
> On Mon, Apr 20, 2009 at 3:16 PM, Doug Williams wrote:
> > 3. For each status in the set, perform an SQL insert to save the status.
>
> Or, I would hope, create an array of inserts and do a multi-insert, which
> will be far faster than iterating through a list.
>
> http://www.desilva.biz/mysql/insert.html
>
> I'll bet you knew that, but I just had to note it because the performance
> difference is enormous.
>
> Nick
> (not really a PHP guy, but years of (often painfully gained) MySQL
> performance knowledge)
[twitter-dev] Re: Read and store Twitter responses
On Mon, Apr 20, 2009 at 5:41 PM, Doug Williams wrote:
> Nick,
> Batch INSERTs are great for people looking for performance tweaks.
> Serial INSERT statements within the iteration loop keep things simple for
> those just starting out.

True, of course... and now that I think about it, double-byte characters in the midst of a failing multi-insert can be hard to figure out if you don't know what you're doing.

Speaking of which, for anybody who is getting started in this sort of thing: setting the default character set in MySQL to UTF-8 (before creating tables!) will help avoid a lot of the confusion and headaches that drove me slightly nuts, and I'm far from a newbie.

Nick
[twitter-dev] Re: Read and store Twitter responses
Nick,

Batch INSERTs are great for people looking for performance tweaks. Serial INSERT statements within the iteration loop keep things simple for those just starting out.

Doug Williams
Twitter API Support
http://twitter.com/dougw

On Mon, Apr 20, 2009 at 4:14 PM, Nick Arnett wrote:
> On Mon, Apr 20, 2009 at 3:16 PM, Doug Williams wrote:
> > 3. For each status in the set, perform an SQL insert to save the status.
>
> Or, I would hope, create an array of inserts and do a multi-insert, which
> will be far faster than iterating through a list.
>
> http://www.desilva.biz/mysql/insert.html
>
> I'll bet you knew that, but I just had to note it because the performance
> difference is enormous.
>
> Nick
> (not really a PHP guy, but years of (often painfully gained) MySQL
> performance knowledge)
[twitter-dev] Re: Read and store Twitter responses
On Mon, Apr 20, 2009 at 3:16 PM, Doug Williams wrote:
> 3. For each status in the set, perform an SQL insert to save the status.

Or, I would hope, create an array of inserts and do a multi-insert, which will be far faster than iterating through a list.

http://www.desilva.biz/mysql/insert.html

I'll bet you knew that, but I just had to note it because the performance difference is enormous.

Nick
(not really a PHP guy, but years of (often painfully gained) MySQL performance knowledge)
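To make the contrast concrete, here is a small sketch of a row-by-row insert next to a multi-row insert, in Python with sqlite3 standing in for PHP and MySQL. The statuses table, its columns, the function names, and the chunk size of 100 are all assumptions for illustration, and the sketch shows only the shape of the two approaches; it does not measure the speed difference Nick reports for MySQL.

```python
import sqlite3

INSERT_PREFIX = "INSERT INTO statuses (status_id, user_id, text) VALUES "


def make_status_db():
    """Create an in-memory statuses table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE statuses ("
        "status_id INTEGER PRIMARY KEY, user_id INTEGER, text TEXT)"
    )
    return conn


def insert_one_by_one(conn, rows):
    """One INSERT statement per status, as in a simple iteration loop."""
    for row in rows:
        conn.execute(INSERT_PREFIX + "(?, ?, ?)", row)


def insert_multi(conn, rows, chunk_size=100):
    """One INSERT statement per chunk, each carrying many VALUES tuples."""
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        # Build "(?, ?, ?), (?, ?, ?), ..." with one group per row.
        placeholders = ", ".join(["(?, ?, ?)"] * len(chunk))
        # Flatten the row tuples into a single parameter list.
        params = [value for row in chunk for value in row]
        conn.execute(INSERT_PREFIX + placeholders, params)


def count_statuses(conn):
    return conn.execute("SELECT COUNT(*) FROM statuses").fetchone()[0]
```

Chunking keeps each statement below the database's limit on bound parameters; with MySQL the analogous ceiling is the maximum packet size, so very large batches are usually split there as well.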
[twitter-dev] Re: Read and store Twitter responses
I've broken the task into logical steps to get you started. I'd suggest searching Google and the wiki [1] for the libraries and implementation details for each:

1. Download a timeline or a set of statuses.
2. Iterate through that set of statuses, pulling out each individual status.
3. For each status in the set, perform an SQL insert to save the status.

A great way to learn is to try to find sample code that gets you through each of these steps separately, then put them together. There is plenty of PHP and MySQL sample code available online or in books to get you started.

[1] http://apiwiki.twitter.com/Libraries#PHP

Thanks,
Doug Williams
Twitter API Support
http://twitter.com/dougw

On Mon, Apr 20, 2009 at 1:53 PM, Andrew Badera wrote:
> This isn't a SQL tutorial nor a MySQL list. Some might suggest you'd be
> better off learning the basics of what you're trying to do -- learning how
> to walk before you can run and all that.
>
> Thanks-
> - Andy Badera
> - and...@badera.us
> - Google me: http://www.google.com/search?q=andrew+badera
>
> On Mon, Apr 20, 2009 at 4:41 PM, CWitt wrote:
>> Is there anywhere I could take a look at some of this code to store
>> the Twitter data in a MySQL database?
>>
>> On Apr 19, 8:50 pm, Nick Arnett wrote:
>> > On Sun, Apr 19, 2009 at 2:45 PM, CWitt wrote:
>> > > My skills are rather limited, but I was thinking PHP and MySQL. I was
>> > > thinking about hiring it out, but putting together the process flow to
>> > > help the programmer and also help me find the correct programmer.
>> >
>> > PHP and MySQL sound appropriate to what you're hoping to do. Storing
>> > Twitter data in MySQL is generally not a big deal, since there is such
>> > limited data. A lot of us have probably created similar schemas for that
>> > purpose. The rest of your code sounds slightly more complex, especially if
>> > you're trying to do some sort of natural language parsing, which is always
>> > hard. I don't know if there are libraries in PHP for that purpose. There
>> > are in other languages.
>> >
>> > In any case, without specifics, it would be hard for anyone to guide you.
>> >
>> > Nick
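The three steps Doug lists can be sketched end to end as below. This is only an illustration in Python with sqlite3, not the PHP and MySQL code the thread has in mind. A hard-coded JSON string stands in for the downloaded timeline so that the example runs offline; the sample statuses, the table layout, and the function name are assumptions.

```python
import json
import sqlite3

# Made-up stand-in for the body of a downloaded timeline (step 1).
SAMPLE_TIMELINE_JSON = """
[
  {"id": 101, "text": "first status", "user": {"id": 1, "screen_name": "alice"}},
  {"id": 102, "text": "second status", "user": {"id": 2, "screen_name": "bob"}}
]
"""


def store_timeline(conn, timeline_json):
    """Parse a timeline and save each status; returns how many were saved."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS statuses ("
        "status_id INTEGER PRIMARY KEY, user_id INTEGER, text TEXT)"
    )
    # Step 1 (stand-in): decode the response body that was downloaded.
    statuses = json.loads(timeline_json)
    saved = 0
    # Step 2: iterate through the set, pulling out each individual status.
    for status in statuses:
        # Step 3: one SQL insert per status, with bound parameters so the
        # status text cannot break the statement.
        conn.execute(
            "INSERT INTO statuses (status_id, user_id, text) VALUES (?, ?, ?)",
            (status["id"], status["user"]["id"], status["text"]),
        )
        saved += 1
    conn.commit()
    return saved
```

For example, `store_timeline(sqlite3.connect(":memory:"), SAMPLE_TIMELINE_JSON)` saves both sample statuses. In a real program the JSON string would come from an HTTP request made through one of the libraries on the wiki page.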
[twitter-dev] Re: Read and store Twitter responses
This isn't a SQL tutorial nor a MySQL list. Some might suggest you'd be better off learning the basics of what you're trying to do -- learning how to walk before you can run and all that.

Thanks-
- Andy Badera
- and...@badera.us
- Google me: http://www.google.com/search?q=andrew+badera

On Mon, Apr 20, 2009 at 4:41 PM, CWitt wrote:
> Is there anywhere I could take a look at some of this code to store
> the Twitter data in a MySQL database?
>
> On Apr 19, 8:50 pm, Nick Arnett wrote:
> > On Sun, Apr 19, 2009 at 2:45 PM, CWitt wrote:
> > > My skills are rather limited, but I was thinking PHP and MySQL. I was
> > > thinking about hiring it out, but putting together the process flow to
> > > help the programmer and also help me find the correct programmer.
> >
> > PHP and MySQL sound appropriate to what you're hoping to do. Storing
> > Twitter data in MySQL is generally not a big deal, since there is such
> > limited data. A lot of us have probably created similar schemas for that
> > purpose. The rest of your code sounds slightly more complex, especially if
> > you're trying to do some sort of natural language parsing, which is always
> > hard. I don't know if there are libraries in PHP for that purpose. There
> > are in other languages.
> >
> > In any case, without specifics, it would be hard for anyone to guide you.
> >
> > Nick
[twitter-dev] Re: Read and store Twitter responses
Is there anywhere I could take a look at some of this code to store the Twitter data in a MySQL database?

On Apr 19, 8:50 pm, Nick Arnett wrote:
> On Sun, Apr 19, 2009 at 2:45 PM, CWitt wrote:
> > My skills are rather limited, but I was thinking PHP and MySQL. I was
> > thinking about hiring it out, but putting together the process flow to
> > help the programmer and also help me find the correct programmer.
>
> PHP and MySQL sound appropriate to what you're hoping to do. Storing
> Twitter data in MySQL is generally not a big deal, since there is such
> limited data. A lot of us have probably created similar schemas for that
> purpose. The rest of your code sounds slightly more complex, especially if
> you're trying to do some sort of natural language parsing, which is always
> hard. I don't know if there are libraries in PHP for that purpose. There
> are in other languages.
>
> In any case, without specifics, it would be hard for anyone to guide you.
>
> Nick
[twitter-dev] Re: Read and store Twitter responses
On Sun, Apr 19, 2009 at 2:45 PM, CWitt wrote:
> My skills are rather limited, but I was thinking PHP and MySQL. I was
> thinking about hiring it out, but putting together the process flow to
> help the programmer and also help me find the correct programmer.

PHP and MySQL sound appropriate to what you're hoping to do. Storing Twitter data in MySQL is generally not a big deal, since there is such limited data. A lot of us have probably created similar schemas for that purpose. The rest of your code sounds slightly more complex, especially if you're trying to do some sort of natural language parsing, which is always hard. I don't know if there are libraries in PHP for that purpose. There are in other languages.

In any case, without specifics, it would be hard for anyone to guide you.

Nick
[twitter-dev] Re: Read and store Twitter responses
My skills are rather limited, but I was thinking PHP and MySQL. I was thinking about hiring it out, but putting together the process flow to help the programmer and also help me find the correct programmer.

On Apr 16, 10:52 am, Nick Arnett wrote:
> On Wed, Apr 15, 2009 at 6:30 PM, CWitt wrote:
> > I've looked through the discussion, what I do understand is that it is
> > acceptable to store Twitter search results in my own database. What I
> > am wondering is how to extract this information and actually store it
> > in my database.
>
> Broad question... what language(s) do you code in? What databases are you
> familiar with? What is the web platform you are using?
>
> Nick
[twitter-dev] Re: Read and store Twitter responses
On Wed, Apr 15, 2009 at 6:30 PM, CWitt wrote:
> I've looked through the discussion, what I do understand is that it is
> acceptable to store Twitter search results in my own database. What I
> am wondering is how to extract this information and actually store it
> in my database.

Broad question... what language(s) do you code in? What databases are you familiar with? What is the web platform you are using?

Nick