In recent months Google has been experimenting with HTML forms to discover links to pages and sites that it would not otherwise be able to find and index for users searching on Google. This applies in particular when it finds a “FORM” element on a high-quality website. In that case Google will most likely run a small number of queries (searches) through the form. For text boxes, it automatically chooses words from the page that contains the form; for select menus, check boxes and radio buttons, it chooses from the values contained in the form’s code. Having chosen a value for each input, it generates the corresponding URLs and fetches them in an attempt to find new pages and sites. If it then finds pages or sites that are not yet included in its index, they are added to the database like any other page.
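The value-selection step described above can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not Google’s actual method: the function name `candidate_urls`, the `fields` dictionary layout and the `limit` parameter are all hypothetical, and the example.com address is a placeholder.

```python
from itertools import islice, product
from urllib.parse import urlencode, urljoin


def candidate_urls(page_url, action, fields, page_words, limit=5):
    """Build a small number of GET URLs that a form submission could produce.

    fields maps each input name to a list of allowed values (select menus,
    check boxes, radio buttons), or to None for a free-text box.
    All names here are hypothetical and chosen for illustration only.
    """
    names = list(fields)
    # Text boxes draw candidate values from words found on the page;
    # other inputs use the values listed in the form's own markup.
    choices = [page_words if fields[name] is None else fields[name]
               for name in names]
    base = urljoin(page_url, action)
    urls = []
    # islice caps the number of combinations, so only a few queries are made.
    for combo in islice(product(*choices), limit):
        urls.append(base + "?" + urlencode(dict(zip(names, combo))))
    return urls


if __name__ == "__main__":
    for url in candidate_urls(
        "https://example.com/catalog/",
        "search",
        {"q": None, "category": ["books", "music"]},
        page_words=["guitar", "piano"],
        limit=3,
    ):
        print(url)
```

Capping the combinations with `limit` mirrors the idea that only a small number of queries are sent through any one form.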
It perhaps goes without saying, but they should have done this long ago. However, only a few valuable sites online will get this “treatment”. Google’s robot (Googlebot) always respects the robots.txt, nofollow and noindex directives that the site owner has set up. This means that if a form is excluded in robots.txt, the robot will honor that and will not index the content reachable through the form. It is worth adding that Google only submits forms that use the “get” method. It avoids forms that require any user information. For example, it skips all forms that have a password field or that are otherwise linked to user information such as login, user ID, contacts, and so on.
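The exclusion rules in this paragraph can be sketched as a simple filter. Again, this is a rough illustration and not Googlebot’s real logic: the function `form_is_crawlable`, the `form` dictionary layout and the `PERSONAL_TERMS` list are assumptions made for the example, and the nofollow and noindex checks are left out for brevity. The robots.txt check uses Python’s standard `urllib.robotparser` module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical list of name fragments that suggest personal information.
PERSONAL_TERMS = ("login", "user", "password", "contact")


def form_is_crawlable(form, action_url, robots_txt, agent="Googlebot"):
    """Return True if a form passes the checks described in the text.

    form is a hypothetical dict such as:
    {"method": "get", "inputs": [("q", "text"), ("category", "select")]}
    """
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(agent, action_url):
        return False  # the form's target is excluded in robots.txt
    if form["method"].lower() != "get":
        return False  # only GET forms are considered
    for name, input_type in form["inputs"]:
        if input_type == "password":
            return False  # password fields imply user information
        if any(term in name.lower() for term in PERSONAL_TERMS):
            return False  # field name hints at personal data
    return True


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    search_form = {"method": "get", "inputs": [("q", "text")]}
    print(form_is_crawlable(search_form, "https://example.com/search", robots))
```

Matching on field names is a crude heuristic; a real crawler would presumably look at more signals than this.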
The pages found through this enhanced indexing do not come at the expense of the normal pages that are already indexed. The change will not reduce the PageRank of the other pages. It can therefore only increase the number of pages indexed for the site on Google. The change will not affect crawling or ranking.
This experiment is intended to expand Google’s coverage of the Web. HTML forms have long been a gateway to large amounts of content that search engine robots could not previously reach. By being able to spider HTML forms (those not excluded in robots.txt), Google will be able to lead users to documents that they would not otherwise have been able to reach through a search engine.