At 1:00 PM +0300 8/28/09, Petros Ziogas wrote:
I would just like to mention a point of failure in that automated process. I had to deal with this in a previous project, so it's quite fresh.

What will happen if:

Problem 1

There are 3 articles. Article A is titled "History of America". Article B is titled "Glorious History of America". In article C there is this text: "The book is talking about the glorious history of America". If you run an automated process and the test for article A comes first, then the text will be "The book is talking about the glorious <a href="/id1111/">history of America</a>" and the next test (for article B) will fail, because the phrase is now interrupted by the link markup.

If you run a test for article B first the text will become "The book is talking about the <a href="/id2222/">glorious history of America</a>". Then if you test for article A it might end up being "The book is talking about the <a href="/id2222/">glorious <a href="/id1111/">history of America</a></a>"

The possibilities of such procedures practically ruining your content are endless. If you want to dive into tag nesting and HTML validation, you will be opening a whole other can of worms.
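
Both outcomes described in Problem 1 can be reproduced with a few lines. This is only a sketch in Python of a naive, case-insensitive search/replace; the function name naive_autolink is made up for the illustration and is not the process anyone on the list actually ran.

```python
import re

def naive_autolink(text, title, url):
    """Wrap every occurrence of title in a link, with no regard for existing markup."""
    pattern = re.compile(re.escape(title), re.IGNORECASE)
    return pattern.sub(lambda m: '<a href="%s">%s</a>' % (url, m.group(0)), text)

text = "The book is talking about the glorious history of America"

# Article A first: the later search for article B no longer finds its phrase.
a_first = naive_autolink(text, "History of America", "/id1111/")
a_then_b = naive_autolink(a_first, "Glorious History of America", "/id2222/")

# Article B first: the later search for article A matches inside B's link.
b_first = naive_autolink(text, "Glorious History of America", "/id2222/")
b_then_a = naive_autolink(b_first, "History of America", "/id1111/")
```

Here a_then_b comes back unchanged from a_first (the test for article B fails), and b_then_a contains one link nested inside another.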

Problem 2

Also, what will happen if an editor wants to insert this: "I loved the book <a href="LINKTOAMAZON">George Washington and the Glorious history of America</a>." and there are articles with titles using "George Washington", "Glorious history", "History of America", "America"?

I think you get my point...

Petros:

Yes, I see your point and the two problems you raise (good concerns).

Problem 1

My initial solution would solve the first problem *provided* that the titles were unique and not contained within another title, right? So why not start with the longest title and search/replace downwards?

For example, "Glorious History of America" is searched, found, and made a link. Then "History of America" is searched -- however -- the search excludes links! The phrase "History of America" in "Glorious History of America" would never be considered because it's within a link.

The process would continue until you run out of titles -- simple, right?
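
A minimal sketch of that longest-first pass, in Python for brevity (the same idea maps onto preg_split and preg_replace_callback in PHP). The function name autolink and the title-to-URL dictionary are assumptions made for the illustration. It also assumes links are never nested and that titles start and end with a letter or digit, and it does not guard against a title appearing inside some other tag's attribute.

```python
import re

# Matches a complete <a ...>...</a> element so existing links can be skipped.
LINK_RE = re.compile(r"(<a\b[^>]*>.*?</a>)", re.IGNORECASE | re.DOTALL)

def autolink(text, titles):
    """titles maps article title -> URL (a hypothetical data shape)."""
    # Longest titles first, so "Glorious History of America" is linked
    # before "History of America" gets a chance.
    for title in sorted(titles, key=len, reverse=True):
        url = titles[title]
        pattern = re.compile(r"\b" + re.escape(title) + r"\b", re.IGNORECASE)
        # Splitting on a capturing group keeps the links in the result:
        # even indexes are plain text, odd indexes are existing links.
        parts = LINK_RE.split(text)
        for i in range(0, len(parts), 2):
            parts[i] = pattern.sub(
                lambda m: '<a class="autotag" href="%s">%s</a>' % (url, m.group(0)),
                parts[i],
            )
        text = "".join(parts)
    return text
```

Because the text is re-split before each title, links added for a longer title are already protected when the shorter titles are searched.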

Problem 2

The second problem can be solved two ways:

Way one -- by removing all organic links (those the editors inserted by hand) before the search. In other words, when the FULL TEXT search is started, it is run on articles stripped of all organic links. You can easily add the organic links back in after the search/replace is finished.
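
Way one could be sketched roughly like this (Python, with made-up helper names protect_links and restore_links). Each existing link is swapped for a single placeholder character from the Unicode private-use range and put back afterwards. The sketch assumes the article text never contains private-use characters itself and that links are not nested.

```python
import re

LINK_RE = re.compile(r"<a\b[^>]*>.*?</a>", re.IGNORECASE | re.DOTALL)

# Unicode private-use range; one character stands in for one saved link.
PUA_START = 0xE000
PUA_END = 0xF8FF

def protect_links(text):
    """Swap every existing link for a placeholder; return the new text and the saved links."""
    saved = []

    def stash(match):
        if PUA_START + len(saved) > PUA_END:
            raise ValueError("too many links for this placeholder scheme")
        saved.append(match.group(0))
        return chr(PUA_START + len(saved) - 1)

    return LINK_RE.sub(stash, text), saved

def restore_links(text, saved):
    """Put each saved link back where its placeholder sits."""
    return re.sub(
        "[\ue000-\uf8ff]",
        lambda m: saved[ord(m.group(0)) - PUA_START],
        text,
    )
```

The search/replace routine would run on the text returned by protect_links, so a title such as "America" is never found inside the editor's Amazon link, and restore_links is called once the routine is done.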

Please note that when the automated links are added, they also get a unique class attribute, such as class="autotag", which allows them to be easily identified and removed for a rebuild.
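
Removing the automated links before a rebuild might look something like the following Python sketch. The name strip_autolinks is hypothetical, and the pattern assumes the attribute is written exactly as class="autotag" with double quotes and that automated links contain only plain text.

```python
import re

# Only matches <a> tags whose attribute list contains class="autotag".
AUTOTAG_RE = re.compile(
    r'<a\b(?=[^>]*\bclass="autotag")[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

def strip_autolinks(text):
    """Unwrap automated links, keeping their text; organic links are left alone."""
    return AUTOTAG_RE.sub(r"\1", text)
```

Links without the class, such as an editor's own link to Amazon, pass through untouched.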

Way two -- you could solve the problem by excluding organic links from the search because they DO NOT have the unique class attribute identifier -- so there is no real reason to remove them at all before the search/replace routine (as in Way one). I only presented Way one to get you to think in terms of removing the organic links from the problem.

Possible problem

The only fly in the ointment here would be if an editor wants to manually link an article by trying to mimic the automated process. For example, he/she inserts a "<a href="/id1111/">History of America</a>" using the *index* of the article. Everything would still work unless that article is deleted. In that case the link would become dead.

However, if the editor simply added the class attribute (i.e., class="autotag") to the link, then the automated process would treat the entry like its own and adjust accordingly.

If the editors simply followed the rules, which aren't complicated, then editors could participate as they want in the process.

The solution presented here doesn't require tag nesting or html validation. As such, I don't see any additional problems -- do you?

Cheers,

tedd

--
-------
http://sperling.com  http://ancientstones.com  http://earthstones.com
_______________________________________________
New York PHP User Group Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk

http://www.nyphp.org/show_participation.php
