[twitter-dev] Re: Read and store Twitter responses

2009-04-21 Thread Nick Arnett
On Mon, Apr 20, 2009 at 5:59 PM, Joseph  wrote:

>
> This may not be the best thing to do in the case of statuses.
> Optimization implies that you have two tables (minimum), one for the
> user info, and one for the tweets. Doing a batch update means that
> you're skipping the step of checking to see if the user is already in
> the database, so for every tweet, you will add the same user again.
> That will slow you down much more than the batch advantage,
> and will create confusion (unless you store all in one table, and
> that's even more burdensome).


There are a couple of ways to deal with this. Given sufficient memory, keep
a hash of userIDs in memory and only insert the new ones.  If memory
consumption is a problem, assuming that the userID is the primary key in the
user table,  do an INSERT IGNORE for all of the users.  With userID indexed,
that will be quite fast.
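
A minimal sketch of both approaches, for anyone who wants to see them side by side. It is written in Python against the standard-library sqlite3 module purely so it runs without a server; the table and column names (users, user_id, screen_name) and the function names are assumptions made for illustration. On MySQL the second statement would be written INSERT IGNORE rather than SQLite's INSERT OR IGNORE.

```python
import sqlite3


def make_db():
    """Create an in-memory database with the userID as the primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (user_id INTEGER PRIMARY KEY, screen_name TEXT)"
    )
    return conn


def insert_new_users(conn, users, seen_ids):
    """Option 1: keep a set of known userIDs in memory, insert only new ones."""
    new_rows = []
    for user in users:
        if user["id"] not in seen_ids:
            # Recording the ID here also guards against repeats within one batch.
            seen_ids.add(user["id"])
            new_rows.append((user["id"], user["screen_name"]))
    conn.executemany(
        "INSERT INTO users (user_id, screen_name) VALUES (?, ?)", new_rows
    )
    return len(new_rows)


def insert_users_ignore(conn, users):
    """Option 2: insert everything and let the primary key reject duplicates.

    MySQL spells this INSERT IGNORE; SQLite spells it INSERT OR IGNORE.
    """
    rows = [(u["id"], u["screen_name"]) for u in users]
    conn.executemany(
        "INSERT OR IGNORE INTO users (user_id, screen_name) VALUES (?, ?)", rows
    )
```

The first option costs memory proportional to the number of users seen so far; the second pushes the duplicate check down to the primary-key index.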

It won't be that simple if you have foreign key constraints, but I can't
imagine referential integrity is critical for this sort of application.

My system is far more constrained by things other than the insert speeds.

Nick


[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Abraham Williams
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
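
For reference, that page covers INSERT ... ON DUPLICATE KEY UPDATE, which inserts a row or, if the key already exists, updates the existing row instead. Below is a rough sketch of the idea in Python, using the standard-library sqlite3 module as a stand-in so it runs without a MySQL server. The users table, its columns, and the upsert_user helper are hypothetical. SQLite (3.24 or later) writes the clause as ON CONFLICT ... DO UPDATE; the MySQL form is shown in the comment.

```python
import sqlite3


def make_db():
    """In-memory stand-in for a MySQL users table keyed on user_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (user_id INTEGER PRIMARY KEY, screen_name TEXT)"
    )
    return conn


def upsert_user(conn, user_id, screen_name):
    """Insert the user, or update screen_name if user_id already exists.

    MySQL form of the same statement:
      INSERT INTO users (user_id, screen_name) VALUES (7, 'name')
      ON DUPLICATE KEY UPDATE screen_name = VALUES(screen_name);
    """
    conn.execute(
        "INSERT INTO users (user_id, screen_name) VALUES (?, ?) "
        "ON CONFLICT(user_id) DO UPDATE SET screen_name = excluded.screen_name",
        (user_id, screen_name),
    )
```

Calling upsert_user twice with the same user_id leaves a single row holding the most recent screen_name, with no separate existence check.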

On Mon, Apr 20, 2009 at 19:59, Joseph  wrote:

>
> This may not be the best thing to do in the case of statuses.
> Optimization implies that you have two tables (minimum), one for the
> user info, and one for the tweets. Doing a batch update means that
> you're skipping the step of checking to see if the user is already in
> the database, so for every tweet, you will add the same user again.
> That will slow you down much more than the batch advantage,
> and will create confusion (unless you store all in one table, and
> that's even more burdensome).
>
> Now, does anyone know if there's some obscure version of UPDATE that
> takes parameters to allow me to use UPDATE instead of INSERT (saving
> me from the extra step of checking if the person is already in my
> database). I'm fairly new to MySQL.
>
> On Apr 20, 4:14 pm, Nick Arnett  wrote:
> > On Mon, Apr 20, 2009 at 3:16 PM, Doug Williams  wrote:
> >
> > 3. For each status in the set, perform an SQL insert to save the status.
> >
> > Or, I would hope, create an array of inserts and do a multi-insert, which
> > will be far faster than iterating through a list.
> >
> > http://www.desilva.biz/mysql/insert.html
> >
> > I'll bet you knew that, but I just had to note it because the performance
> > difference is enormous.
> >
> > Nick
> > (not really a PHP guy, but years of (often painfully gained) MySQL
> > performance knowledge)
>



-- 
Abraham Williams | http://the.hackerconundrum.com
Hacker | http://abrah.am | http://twitter.com/abraham
Web608 | Community Evangelist | http://web608.org
This email is: [ ] blogable [x] ask first [ ] private.
Sent from Madison, Wisconsin, United States


[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Joseph

This may not be the best thing to do in the case of statuses.
Optimization implies that you have two tables (minimum), one for the
user info, and one for the tweets. Doing a batch update means that
you're skipping the step of checking to see if the user is already in
the database, so for every tweet, you will add the same user again.
That will slow you down much more than the batch advantage,
and will create confusion (unless you store all in one table, and
that's even more burdensome).

Now, does anyone know if there's some obscure version of UPDATE that
takes parameters to allow me to use UPDATE instead of INSERT (saving
me from the extra step of checking if the person is already in my
database). I'm fairly new to MySQL.

On Apr 20, 4:14 pm, Nick Arnett  wrote:
> On Mon, Apr 20, 2009 at 3:16 PM, Doug Williams  wrote:
>
> 3. For each status in the set, perform an SQL insert to save the status.
>
> Or, I would hope, create an array of inserts and do a multi-insert, which
> will be far faster than iterating through a list.
>
> http://www.desilva.biz/mysql/insert.html
>
> I'll bet you knew that, but I just had to note it because the performance
> difference is enormous.
>
> Nick
> (not really a PHP guy, but years of (often painfully gained) MySQL
> performance knowledge)


[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Nick Arnett
On Mon, Apr 20, 2009 at 5:41 PM, Doug Williams  wrote:

> Nick,
> Batch INSERTs are great for people looking for performance tweaks.
> Serial INSERT statements within the iteration loop keep things simple for
> those just starting out.
>

True, of course... and now that I think about it, double-byte characters in
the midst of a failing multi insert can be hard to figure out if you don't
know what you're doing.

Speaking of which, for anybody who is getting started in this sort of thing:
setting the default character set in MySQL to UTF-8 (before creating
tables!) will help avoid a lot of the confusion and headaches that drove me
slightly nuts, and I'm far from a newbie.

Nick


[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Doug Williams
Nick,
Batch INSERTs are great for people looking for performance tweaks. Serial
INSERT statements within the iteration loop keep things simple for those
just starting out.

Doug Williams
Twitter API Support
http://twitter.com/dougw


On Mon, Apr 20, 2009 at 4:14 PM, Nick Arnett  wrote:

>
>
> On Mon, Apr 20, 2009 at 3:16 PM, Doug Williams  wrote:
>
> 3. For each status in the set, perform an SQL insert to save the status.
>
>
> Or, I would hope, create an array of inserts and do a multi-insert, which
> will be far faster than iterating through a list.
>
> http://www.desilva.biz/mysql/insert.html
>
> I'll bet you knew that, but I just had to note it because the performance
> difference is enormous.
>
> Nick
> (not really a PHP guy, but years of (often painfully gained) MySQL
> performance knowledge)
>


[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Nick Arnett
On Mon, Apr 20, 2009 at 3:16 PM, Doug Williams  wrote:

3. For each status in the set, perform an SQL insert to save the status.


Or, I would hope, create an array of inserts and do a multi-insert, which
will be far faster than iterating through a list.

http://www.desilva.biz/mysql/insert.html

I'll bet you knew that, but I just had to note it because the performance
difference is enormous.
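
To make the multi-insert idea concrete, here is a small sketch in Python, with the standard-library sqlite3 module standing in for MySQL so that it runs anywhere. The statuses table, its columns, and the function name are made up for illustration. The point is that one statement carries many rows in its VALUES list, with placeholders so the driver handles quoting.

```python
import sqlite3


def make_db():
    """In-memory stand-in for a MySQL statuses table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE statuses ("
        "status_id INTEGER PRIMARY KEY, user_id INTEGER, text TEXT)"
    )
    return conn


def insert_statuses_batch(conn, statuses):
    """Save all statuses with a single multi-row INSERT statement."""
    if not statuses:
        return 0
    # One "(?, ?, ?)" group per row: VALUES (?, ?, ?), (?, ?, ?), ...
    placeholders = ", ".join(["(?, ?, ?)"] * len(statuses))
    # Flatten the rows into one parameter list in matching order.
    params = [
        value
        for s in statuses
        for value in (s["id"], s["user_id"], s["text"])
    ]
    conn.execute(
        "INSERT INTO statuses (status_id, user_id, text) VALUES " + placeholders,
        params,
    )
    return len(statuses)
```

Very large batches would need to be split into chunks: SQLite caps the number of bound parameters per statement, and MySQL caps statement size through max_allowed_packet.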

Nick
(not really a PHP guy, but years of (often painfully gained) MySQL
performance knowledge)


[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Doug Williams
I've broken the task into logical steps to get you started. I'd suggest
searching Google and the wiki [1] for the libraries and implementation
details for each:

1. Download a timeline or a set of statuses.
2. Iterate through that set of statuses, pulling out each individual status.
3. For each status in the set, perform an SQL insert to save the status.
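
The three steps above can be sketched roughly as follows. This is only an illustration in Python with the standard-library json and sqlite3 modules, not PHP and MySQL: the download in step 1 is replaced by a hard-coded sample string so the example runs offline, and the sample data, table layout, and function names are all assumptions.

```python
import json
import sqlite3

# Step 1 stand-in: made-up sample data shaped like a small timeline response.
SAMPLE_TIMELINE = """[
  {"id": 1001, "text": "first status",
   "user": {"id": 7, "screen_name": "example_a"}},
  {"id": 1002, "text": "second status",
   "user": {"id": 8, "screen_name": "example_b"}}
]"""


def make_db():
    """In-memory stand-in for a MySQL statuses table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE statuses ("
        "status_id INTEGER PRIMARY KEY, user_id INTEGER, text TEXT)"
    )
    return conn


def save_timeline(conn, timeline_json):
    """Parse a timeline and save each status with its own INSERT."""
    statuses = json.loads(timeline_json)
    saved = 0
    # Step 2: iterate through the set, pulling out each individual status.
    for status in statuses:
        # Step 3: one parameterized SQL insert per status.
        conn.execute(
            "INSERT INTO statuses (status_id, user_id, text) VALUES (?, ?, ?)",
            (status["id"], status["user"]["id"], status["text"]),
        )
        saved += 1
    conn.commit()
    return saved
```

Calling save_timeline(make_db(), SAMPLE_TIMELINE) stores both sample statuses. In a real script the sample string would be replaced by the body of an API response.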

A great way to learn is to try and find sample code that gets you each of
these steps separately, then put them together. There is plenty of PHP and
MySQL sample code available online or in books to get you started.

1. http://apiwiki.twitter.com/Libraries#PHP

Thanks,
Doug Williams
Twitter API Support
http://twitter.com/dougw


On Mon, Apr 20, 2009 at 1:53 PM, Andrew Badera  wrote:

> This isn't a SQL tutorial nor a MySQL list. Some might suggest you'd be
> better off learning the basics of what you're trying to do -- learning how
> to walk before you can run and all that.
>
> Thanks-
> - Andy Badera
> - and...@badera.us
> - Google me: http://www.google.com/search?q=andrew+badera
>
>
>
>
> On Mon, Apr 20, 2009 at 4:41 PM, CWitt  wrote:
>
>>
>> Is there anywhere I could take a look at some of this code to store
>> the Twitter data in a MySQL database?
>>
>> On Apr 19, 8:50 pm, Nick Arnett  wrote:
>> > On Sun, Apr 19, 2009 at 2:45 PM, CWitt  wrote:
>> >
>> > > My skills are rather limited, but I was thinking PHP and MySQL. I was
>> > > thinking about hiring it out, but putting together the process flow to
>> > > help the programmer and also help me find the correct programmer.
>> >
>> > PHP and MySQL sound appropriate to what you're hoping to do.  Storing
>> > Twitter data in MySQL is generally not a big deal, since there is such
>> > limited data.  A lot of us have probably created similar schemas for that
>> > purpose.  The rest of your code sounds slightly more complex, especially if
>> > you're trying to do some sort of natural language parsing, which is always
>> > hard.  I don't know if there are libraries in PHP for that purpose.  There
>> > are in other languages.
>> >
>> > In any case, without specifics, it would be hard for anyone to guide you.
>> >
>> > Nick
>>
>
>


[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Andrew Badera
This isn't a SQL tutorial nor a MySQL list. Some might suggest you'd be
better off learning the basics of what you're trying to do -- learning how
to walk before you can run and all that.

Thanks-
- Andy Badera
- and...@badera.us
- Google me: http://www.google.com/search?q=andrew+badera



On Mon, Apr 20, 2009 at 4:41 PM, CWitt  wrote:

>
> Is there anywhere I could take a look at some of this code to store
> the Twitter data in a MySQL database?
>
> On Apr 19, 8:50 pm, Nick Arnett  wrote:
> > On Sun, Apr 19, 2009 at 2:45 PM, CWitt  wrote:
> >
> > > My skills are rather limited, but I was thinking PHP and MySQL. I was
> > > thinking about hiring it out, but putting together the process flow to
> > > help the programmer and also help me find the correct programmer.
> >
> > PHP and MySQL sound appropriate to what you're hoping to do.  Storing
> > Twitter data in MySQL is generally not a big deal, since there is such
> > limited data.  A lot of us have probably created similar schemas for that
> > purpose.  The rest of your code sounds slightly more complex, especially if
> > you're trying to do some sort of natural language parsing, which is always
> > hard.  I don't know if there are libraries in PHP for that purpose.  There
> > are in other languages.
> >
> > In any case, without specifics, it would be hard for anyone to guide you.
> >
> > Nick
>


[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread CWitt

Is there anywhere I could take a look at some of this code to store
the Twitter data in a MySQL database?

On Apr 19, 8:50 pm, Nick Arnett  wrote:
> On Sun, Apr 19, 2009 at 2:45 PM, CWitt  wrote:
>
> > My skills are rather limited, but I was thinking PHP and MySQL. I was
> > thinking about hiring it out, but putting together the process flow to
> > help the programmer and also help me find the correct programmer.
>
> PHP and MySQL sound appropriate to what you're hoping to do.  Storing
> Twitter data in MySQL is generally not a big deal, since there is such
> limited data.  A lot of us have probably created similar schemas for that
> purpose.  The rest of your code sounds slightly more complex, especially if
> you're trying to do some sort of natural language parsing, which is always
> hard.  I don't know if there are libraries in PHP for that purpose.  There
> are in other languages.
>
> In any case, without specifics, it would be hard for anyone to guide you.
>
> Nick


[twitter-dev] Re: Read and store Twitter responses

2009-04-19 Thread Nick Arnett
On Sun, Apr 19, 2009 at 2:45 PM, CWitt  wrote:

>
> My skills are rather limited, but I was thinking PHP and MySQL. I was
> thinking about hiring it out, but putting together the process flow to
> help the programmer and also help me find the correct programmer.


PHP and MySQL sound appropriate to what you're hoping to do.  Storing
Twitter data in MySQL is generally not a big deal, since there is such
limited data.  A lot of us have probably created similar schemas for that
purpose.  The rest of your code sounds slightly more complex, especially if
you're trying to do some sort of natural language parsing, which is always
hard.  I don't know if there are libraries in PHP for that purpose.  There
are in other languages.

In any case, without specifics, it would be hard for anyone to guide you.

Nick


[twitter-dev] Re: Read and store Twitter responses

2009-04-19 Thread CWitt

My skills are rather limited, but I was thinking PHP and MySQL. I was
thinking about hiring it out, but putting together the process flow to
help the programmer and also help me find the correct programmer.


On Apr 16, 10:52 am, Nick Arnett  wrote:
> On Wed, Apr 15, 2009 at 6:30 PM, CWitt  wrote:
>
> > I've looked through the discussion, what I do understand is that it is
> > acceptable to store Twitter search results in my own database. What I
> > am wondering is how to extract this information and actually store it
> > in my database.
>
> Broad question... what language(s) do you code in?  What databases are you
> familiar with?  What is the web platform you are using?
>
> Nick


[twitter-dev] Re: Read and store Twitter responses

2009-04-16 Thread Nick Arnett
On Wed, Apr 15, 2009 at 6:30 PM, CWitt  wrote:

>
> I've looked through the discussion, what I do understand is that it is
> acceptable to store Twitter search results in my own database. What I
> am wondering is how to extract this information and actually store it
> in my database.


Broad question... what language(s) do you code in?  What databases are you
familiar with?  What is the web platform you are using?

Nick