Re: [Ldsoss] Digitizing handwritten records by stopping spammers (or vice versa)
I've seen this idea before, and the main problem is that digitizing scanned words and CAPTCHA are at cross-purposes. In digitizing, the computer doesn't know the word. In CAPTCHA, the computer knows the word, and it needs to in order to validate the user: if you don't know for sure that the word was typed in correctly, you can't validate the user. Scanned words can be used to validate once they're known, but by then they no longer need digitizing, which kind of defeats the purpose.

You could just take the majority answer, but in order to gather a strong majority you would have to let some minority answers through, some of which may come from invalid users who should not be allowed access. I suspect using digitized text for CAPTCHA would not provide as much use on the digitization side as one might think.

Jake

On 10/2/07, Jon D. wrote:
> Here's an idea...
>
> Some of you may have seen today's (and previous) Slashdot links on reCaptcha, a cool idea that's starting to be more commonly used:
> http://news.bbc.co.uk/2/hi/technology/7023627.stm
> http://recaptcha.net/learnmore.html
>
> Basically they're using a CAPTCHA to digitize old scanned books.[1]
>
> This could be applied to handwritten historic records. However, it might be hard to trust regular schmoes to correctly transcribe handwritten historic texts. One way to address this might be to just ask more people the same word, and if they all (or mostly) match, we can be fairly certain it's transcribed correctly. Or this could just be used to verify a previous manual transcription.
>
> Thoughts?
>
> -Jon
>
> [1] FYI, a CAPTCHA is where you have to type a distorted word, to stop spammers and hackers. For example, when you mistype your password to enter gmail or yahoo mail enough times, it'll require you to type in a word that's blurred. The new application of this anti-spam technique is to use scanned books as the source of words.

___
Ldsoss mailing list
Ldsoss@lists.ldsoss.org
http://lists.ldsoss.org/mailman/listinfo/ldsoss
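The "ask more people the same word" idea in Jon's message can be sketched in a few lines. This is only an illustration, not how reCaptcha is known to work: the function name `consensus` and the thresholds `min_votes` and `min_share` are made up for the example, and real handwriting would probably need more careful normalization than lowercasing.

```python
from collections import Counter


def consensus(answers, min_votes=3, min_share=0.75):
    """Return the agreed transcription, or None if agreement is too weak.

    min_votes and min_share are illustrative thresholds, not values
    taken from reCaptcha or from any indexing project.
    """
    # Normalize so that case and stray whitespace do not split the vote.
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough people have seen this word yet
    word, count = Counter(cleaned).most_common(1)[0]
    if count / len(cleaned) >= min_share:
        return word
    return None  # too much disagreement; keep asking or flag for review


if __name__ == "__main__":
    print(consensus(["Smith", "smith ", "Smith", "Smyth"]))
    print(consensus(["Smith", "Smyth"]))
```

Raising `min_share` makes a wrong transcription less likely to be accepted, at the cost of showing each word to more people. It does not address Jake's objection that the answers being counted may come from users who should not have been let in.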
Re: [Ldsoss] Digitizing handwritten records by stopping spammers (or vice versa)
On 10/2/07, Jon D. wrote:
> Here's an idea...
>
> Some of you may have seen today's (and previous) Slashdot links on reCaptcha, a cool idea that's starting to be more commonly used:
> http://news.bbc.co.uk/2/hi/technology/7023627.stm
> http://recaptcha.net/learnmore.html
>
> Basically they're using a CAPTCHA to digitize old scanned books.[1]

I blogged about this several months ago. I think it's awesome technology and a great way to use something intended for another purpose. I've implemented it on my blog, and strongly encourage others to use it as well. (I have had issues with it on mobile phones, however. I'm not sure if they've worked around that.)

Jesse

--
#!/usr/bin/perl $^=q;@!~|{krwyn{u$$Sn||n|}j=$$Yn{uQjltn{ 0gFzD gD, 00Fz, 0,,( 0hF 0g)F/=, 0 L$/GEIFewe{,$/ 0C$~ @=,m,|,(e 0.), 01,pnn,y{ rw} ;,$0=q,$,,($_=$^)=~y,$/ C-~@=\n\r,-~$:-u/ #y,d,s,(\$.),$1,gee,print
Re: [Ldsoss] Digitizing handwritten records by stopping spammers (or vice versa)
But most of these points are in fact addressed by reCaptcha. The idea I proposed was simply to use handwritten texts instead of printed books as input, which would require just a little bit more verification of accuracy.

-Jon
Re: [Ldsoss] Digitizing handwritten records by stopping spammers (or vice versa)
Having a valid CAPTCHA and then a digitization problem is okay, but recognize that it doesn't mean CAPTCHA can validly be used for digitization, or vice versa. It just means that you've added a service element onto the CAPTCHA so people can do some useful work (on a different problem) at the same time they are validating themselves.

Using handwriting for CAPTCHA can be good for some things, but you have to be careful because certain texts can have writing that regular people will misrecognize. For example: a 200 year old American text with a letter that will get entered by the majority as B.

Jake
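Jake's description of a valid CAPTCHA with a separate digitization problem attached can be sketched as below. This is a rough illustration under stated assumptions, not reCaptcha's actual code: the class `TwoWordChallenge`, its method names, and the image ids are all hypothetical. The user is validated only on a control word whose text is already known, and the guess for the unknown word is recorded only when the control word was typed correctly.

```python
from collections import defaultdict


class TwoWordChallenge:
    """Validate on a known word; collect guesses for an unknown one.

    All names here are hypothetical and exist only for illustration.
    """

    def __init__(self, control_answers):
        # Maps control image id -> text the computer already knows.
        self.control_answers = control_answers
        # Maps unknown image id -> list of guesses from validated users.
        self.guesses = defaultdict(list)

    def submit(self, control_id, control_text, unknown_id, unknown_text):
        """Return True if the user passes; the unknown word never decides this."""
        expected = self.control_answers[control_id].lower()
        passed = control_text.strip().lower() == expected
        if passed:
            # Only keep transcriptions from users who got the known word right.
            self.guesses[unknown_id].append(unknown_text.strip().lower())
        return passed


if __name__ == "__main__":
    challenge = TwoWordChallenge({"c1": "morning"})
    print(challenge.submit("c1", "Morning", "u1", "Bapt."))
    print(challenge.submit("c1", "evening", "u1", "spam"))
    print(dict(challenge.guesses))
```

The guesses collected for each unknown word still have to be reconciled somehow, for example by a majority vote, and Jake's caveat still applies: if most people misread an old letterform the same way, agreement alone will not catch it.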
[Ldsoss] BYU rss feed?
I've got a software program I'd like to develop soon that will require regular text updates of BYU games, especially football (and maybe those of other college football teams?). Is anyone aware of a service somewhere that provides RSS play-by-play of BYU football games?

Thanks,

Jesse
Re: [Ldsoss] BYU rss feed?
Does ESPN do college game RSS feeds?
Re: [Ldsoss] Digitizing handwritten records by stopping spammers (or vice versa)
The inventor of the captcha calls this Human Computation. He gave an interesting talk at Google on the subject that you can watch here:

http://video.google.com/videoplay?docid=-8246463980976635143

He presents it very well and even non-techies (like my wife) enjoyed watching this when I showed them. OK, it was just my wife, but I'm sure others would like it too.

Bryan
Re: [Ldsoss] Digitizing handwritten records by stopping spammers (or vice versa)
Sounds like a good way to do genealogical indexing. Someone should tell the church ;)

Also sounds like an interesting business idea. Farm out captchas to blogs, and pay people for using the captcha.
Re: [Ldsoss] Digitizing handwritten records by stopping spammers (or vice versa)
OK, I just read the fine article and it's the same guy doing the reCAPTCHA thing, Luis von Ahn. This is cool stuff. If you don't have time to watch the Google talk I linked earlier (which you really should), I'll just tell you: he does a similar thing for image recognition using some online games:

http://www.espgame.org/
http://www.peekaboom.org/

Genius. Using it for genealogy indexing seems like a great idea too.

Bryan
Re: [Ldsoss] Digitizing handwritten records by stopping spammers (or vice versa)
On 10/2/07, m h wrote:
> Sounds like a good way to do genealogical indexing. Someone should tell the church ;)
>
> Also sounds like an interesting business idea. Farm out captchas to blogs, and pay people for using the captcha

Seth Godin actually already proposed this idea - it's open for anyone to try!:
http://sethgodin.typepad.com/seths_blog/2006/12/commercializing.html

Another thing to look into is Amazon's Mechanical Turk:
http://www.mturk.com/mturk/welcome

Jesse
Re: [Ldsoss] Digitizing handwritten records by stopping spammers (or vice versa)
On 10/2/07, Jesse Stay wrote:
> Seth Godin actually already proposed this idea - it's open for anyone to try!:
> http://sethgodin.typepad.com/seths_blog/2006/12/commercializing.html

Great minds think alike (or think the same thing a year later). One issue with his idea is that he doesn't use the results for anything useful... (Admittedly, if someone made me translate immigration records to reply to a blog, I probably wouldn't do it much.)

> Another thing to look into is Amazon's Mechanical Turk:

Yes, this is a specialized version of the Turk. As I understand it now, genealogical indexing is done by Volunteer Turks.

-matt