Hi Tedd,

Those were some good ideas about the problems that I thought would arise.
What you are forgetting is that you can't just decide what is more important and what is not. It might be that, logically, the short article (e.g. "America") should be linked and not the long one. If there are many articles and they have organic (sorry, I can't think of a better word) titles, then it will produce weird results.

I still think that automatically cross-linking articles that might have short titles is a bad idea. Imagine IMDb doing that and trying to cross-link the movie "Z" and the movie "IT" inside its editors' articles and reviews.

I think the best living example of this is the ad networks that link words to advertisements inside website content. They can make a text almost unreadable when they insert the ad links.

Petros Ziogas
http://www.royalblue.gr

On Fri, Aug 28, 2009 at 6:05 PM, Chuck Reeves <[email protected]> wrote:
> Another little add-on to these solutions would be to include some kind of
> counter that will prevent the bot from linking more than X times. This way,
> when the article loads, it will not be just one big stream of links.
>
> Thank You
> Chuck Reeves
> Cell: 631-374-0772
> Email: [email protected]
>
> On Fri, Aug 28, 2009 at 10:51 AM, tedd <[email protected]> wrote:
>
>> At 1:00 PM +0300 8/28/09, Petros Ziogas wrote:
>>
>>> I would just like to mention a point of failure in that automated
>>> process. I had to deal with this in a previous project so it's quite fresh.
>>>
>>> What will happen if:
>>>
>>
>> Problem 1
>>
>>> There are 3 articles. Article A is titled "History of America". Article B
>>> is titled "Glorious History of America". In article C there is this text
>>> "The book is talking about the glorious history of America". If you run an
>>> automated process and the test for article A comes first then the text will
>>> be "The book is talking about the glorious <a href="/id1111/">history of
>>> America</a>" and the next test will fail.
>>>
>>> If you run a test for article B first the text will become "The book is
>>> talking about the <a href="/id2222/">glorious history of America</a>". Then
>>> if you test for article A it might end up being "The book is talking about
>>> the <a href="/id2222/">glorious <a href="/id1111/">history of
>>> America</a></a>"
>>>
>>> The possibilities of such procedures practically ruining your content are
>>> endless. If you want to dive into tag nesting and HTML validation you will
>>> be opening up a whole other problem.
>>>
>>
>> Problem 2
>>
>>> Also what will happen if an editor wants to insert this "I loved the book
>>> <a href="LINKTOAMAZON">George Washington and the Glorious history of
>>> America</a>." and there are articles with titles using "George Washington",
>>> "Glorious history", "History of America", "America"?
>>>
>>> I think you get my point...
>>>
>>
>> Petros:
>>
>> Yes, I see your point and the two problems you raise (good concerns).
>>
>> Problem 1
>>
>> My initial solution would solve the first problem *provided* that the
>> titles were unique and not contained within another title, right? So why not
>> start with the longest title and search/replace downwards?
>>
>> For example, "Glorious History of America" is searched, found, and made a
>> link. Then "History of America" is searched -- however -- the search
>> excludes links! The phrase "History of America" in "Glorious History of
>> America" would never be considered because it's within a link.
>>
>> The process would continue until you run out of titles -- simple, right?
>>
>> Problem 2
>>
>> The second problem can be solved two ways:
>>
>> Way one -- by removing all organic links from the initial search. In other
>> words, when the FULL TEXT search is started the search is done on articles
>> absent of all organic links. You can easily add the organic links back in
>> after the search/replace is finished.
>>
>> Please note when the automated links are added, they also have a unique
>> class attribute, such as class="autotag", which will allow them to be easily
>> identified and removed for a rebuild.
>>
>> Way two -- you could solve the problem by excluding organic links from the
>> search because they DO NOT have the unique class attribute identifier --
>> thus no real reason to remove them at all for the search/replace routine
>> (i.e., Way 1). I only presented "Way 1" to get you to think in terms of
>> removing the organic links from the problem.
>>
>> Possible problem
>>
>> The only fly in the ointment here would be if an editor wants to manually
>> link an article by trying to mimic the automated process. For example,
>> he/she inserts a "<a href="/id1111/">History of America</a>" using the
>> *index* of the article. Everything would still work unless that article is
>> deleted. In such case the link would become dead.
>>
>> However, if the editor simply added the class identifier tag (i.e.,
>> class="autotag") to the link, then the automated process would treat his
>> entry like its own and adjust accordingly.
>>
>> If the editors simply followed the rules, which aren't complicated, then
>> editors could participate as they want in the process.
>>
>> The solution presented here doesn't require tag nesting or HTML
>> validation. As such, I don't see any additional problems -- do you?
>>
>> Cheers,
>>
>> tedd
>>
>> --
>> -------
>> http://sperling.com http://ancientstones.com http://earthstones.com
>> _______________________________________________
>> New York PHP User Group Community Talk Mailing List
>> http://lists.nyphp.org/mailman/listinfo/talk
>>
>> http://www.nyphp.org/show_participation.php
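
For reference, here is a minimal sketch of the scheme discussed in the thread above: titles are tried longest first, text inside any existing link is skipped, automatic links carry class="autotag" so they can be stripped for a rebuild, and a counter caps the number of links per article. It is written in Python only for illustration. The names autolink, strip_autolinks, PROTECTED, AUTOTAG and max_links are hypothetical and not from anyone's actual code, and the regex approach assumes simple, non-nested markup. It does not address the objection about short titles such as "Z" or "IT", which would still match wherever they appear as whole words.

```python
import re

# An existing link (organic or automatic) as a whole, or any other single tag.
# Assumption: links are never nested and the markup is simple enough for a regex.
PROTECTED = re.compile(r"(<a\b[^>]*>.*?</a>|<[^>]+>)", re.IGNORECASE | re.DOTALL)

# Links previously inserted by this sketch (hypothetical marker class "autotag").
AUTOTAG = re.compile(r'<a class="autotag"[^>]*>(.*?)</a>', re.DOTALL)


def autolink(html, titles, max_links=3):
    """Link titles (dict: title -> URL) in html, longest title first.

    Text inside existing links or tags is never touched, and at most
    max_links automatic links are added to the whole text.
    """
    added = 0
    for title in sorted(titles, key=len, reverse=True):
        if added >= max_links:
            break
        word = re.compile(r"\b" + re.escape(title) + r"\b", re.IGNORECASE)

        def make_link(match):
            nonlocal added
            if added >= max_links:
                return match.group(0)  # cap reached: leave the text alone
            added += 1
            return '<a class="autotag" href="%s">%s</a>' % (
                titles[title], match.group(0))

        parts = PROTECTED.split(html)
        # Even indexes are plain text; odd indexes are protected links/tags.
        for i in range(0, len(parts), 2):
            parts[i] = word.sub(make_link, parts[i])
        html = "".join(parts)
    return html


def strip_autolinks(html):
    """Remove automatic links before a rebuild, keeping their text."""
    return AUTOTAG.sub(r"\1", html)


titles = {
    "History of America": "/id1111/",
    "Glorious History of America": "/id2222/",
    "America": "/id3333/",
}
text = "The book is talking about the glorious history of America"
linked = autolink(text, titles)
print(linked)
print(strip_autolinks(linked))
```

Because each pass re-splits the text, a link added for a longer title is already protected when the shorter titles are tried, which is the "search excludes links" step. An editor's own link, such as the Amazon one in Problem 2, is skipped for the same reason.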
