General
On this tab you can add, modify, or remove project URLs, set scan depth limits, and define a custom user agent.
Click Add to enter a new site URL, or select Edit to change the selected one. After you press Add, the following window will pop up:

In the URL field, enter the domain name of your site. Include the "www" prefix and the site's domain name with the correct "dot" extension (com, net, org, etc.). For example: www.LinkUtility.com, where www is the prefix, LinkUtility is the domain name, and com is the extension. The program will automatically add the http:// protocol to your URL, so that it looks like this: http://www.LinkUtility.com. You can open the web site's start page in your web browser by clicking . If you want to work with a copy of your web site saved locally, select the Local Files check box, click the Browse button, and specify the path to the site's start page.

Site Authentication. Some web sites require authentication from the user. If your web site requires it, select the Site requires authentication check box and enter your user name and password in the appropriate fields below.
Limits. You can set limits for the maximum scan depth and the total number of links to be checked. Some web sites have a complicated structure, which means that a user needs to make several clicks to reach some of their pages. The start page is the site's "zero" level, and it usually has links to most pages of the web site. If you set the maximum scan depth to 1, the program will scan only the start page and the pages it links to directly. Links found on those first-level pages will be ignored. By default the maximum scan depth is set to zero, which means that the exploration depth is not restricted and the site will be scanned down to the last level. Remember that the deeper the scan, the longer it takes.
Maximum number of links restricts the site scan to a user-defined number of links that the program will check.
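The effect of the two limits can be illustrated with a minimal Python sketch. It is not Link Utility's actual code: the function name crawl, the in-memory links dictionary (a stand-in for downloading and parsing real pages), and the choice to count the start page towards the link limit are all assumptions made for illustration.

```python
from collections import deque

def crawl(links, start, max_depth=0, max_links=0):
    """Breadth-first walk over a link graph; 0 means "no limit" for both limits.

    `links` maps each page to the pages it links to. Returns a dictionary
    {page: depth} with every page that was checked.
    """
    depth_of = {start: 0}            # the start page is level zero
    queue = deque([start])
    while queue:
        page = queue.popleft()
        depth = depth_of[page]
        if max_depth and depth >= max_depth:
            continue                 # links found on this page are not followed
        for target in links.get(page, []):
            if target in depth_of:
                continue             # already checked
            if max_links and len(depth_of) >= max_links:
                return depth_of      # link limit reached, stop the scan
            depth_of[target] = depth + 1
            queue.append(target)
    return depth_of

# Hypothetical site: the start page links to /a and /b, /a links to /c, /c to /d.
site = {"/": ["/a", "/b"], "/a": ["/c"], "/c": ["/d"]}
print(crawl(site, "/", max_depth=1))   # start page and first-level pages only
print(crawl(site, "/"))                # no limits: every reachable page
```

With a maximum depth of 1 the pages /c and /d are never reached, which matches the behaviour described above.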
The user agent is one of many environment variables that the web server receives from the visitor (HTTP_USER_AGENT). For example, when a visitor running Microsoft Internet Explorer 5.5 on Windows 98 visits this site, the HTTP_USER_AGENT looks something like this:
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
Every visitor to your site leaves user-agent information that can be logged.
See also: What Is a Mozilla User Agent?
Link Utility allows you to send a custom user agent to the server. Some webmasters use a robots.txt file to restrict access to the site, or to parts of it, for certain user agents. Select Stick to robots.txt limitations if you want the program to obey the file's instructions, or deselect this check box if you want the program to ignore the file.
Allow cookies. Some sites won't respond if you do not allow them to place a cookie on your computer. If you select the Allow cookies check box, the web server will be allowed to place a cookie on your computer.
Note. Cookies are pieces of information generated by a web server and stored on the user's computer, ready for future access. Cookies are carried in the HTTP headers exchanged between the user's computer and the server. They were introduced to allow user-side customization of web information. For example, cookies are used to personalize web search engines, to allow users to participate in WWW-wide contests (but only once!), and to store shopping lists of items a user has selected while browsing through a virtual shopping mall.
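How robots.txt rules depend on the user agent can be sketched with Python's standard urllib.robotparser module. This only illustrates the robots.txt mechanism, not Link Utility's implementation; the robots.txt content, the BadBot name, and the helper function allowed are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: BadBot is banned from the whole site,
# everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""

def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots.txt lets this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())   # parse from text, no network access
    return parser.can_fetch(user_agent, url)

browser = "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
print(allowed(browser, "http://www.yourhost.com/index.html"))
print(allowed(browser, "http://www.yourhost.com/private/report.html"))
print(allowed("BadBot", "http://www.yourhost.com/index.html"))
```

A scanner that obeys robots.txt would skip every URL for which such a check fails, while a scanner that ignores the file would request the URL anyway.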
External links
On this tab you can set external links options.
Consider links with prefixes external. As previously mentioned, a typical URL contains a prefix (usually www), a domain name, and a "dot" extension (.com, .net, etc.). Some webmasters build their sites so that one more prefix appears when you navigate between pages. For example, the page www.yoursite.com may contain a link to the internal page www.prefix.yoursite.com. If you select the Consider links with prefixes external check box, all links of your web site that have such prefixes will be marked as external.
The Check external links option enables the program to verify links outside your site.
There are two additional options that allow you to include or exclude links containing certain masks, or word patterns. For example, if you have two different web sites and want to include their cross-reference links in the scan, enter parts of these cross-reference links into the corresponding mask field. You can exclude some of the internal links from the check by entering appropriate masks into the other field.
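The following Python sketch shows one way such a decision could be made. It is an assumption about the logic, not the program's actual code: the function is_external and its prefixes_external parameter are hypothetical names, and treating any host that ends with the site's domain as "the same site with a prefix" is a simplification.

```python
from urllib.parse import urlsplit

def is_external(link, site_host, prefixes_external=False):
    """Decide whether a link points outside the site given by site_host."""
    host = (urlsplit(link).hostname or "").lower()
    site_host = site_host.lower()
    if host == site_host:
        return False                       # exactly the project host
    # Domain without the leading "www." prefix, e.g. "yoursite.com".
    base = site_host[4:] if site_host.startswith("www.") else site_host
    if host == base or host.endswith("." + base):
        return prefixes_external           # same domain, different prefix
    return True                            # a different domain altogether

print(is_external("http://www.prefix.yoursite.com/page.html", "www.yoursite.com"))
print(is_external("http://www.prefix.yoursite.com/page.html", "www.yoursite.com",
                  prefixes_external=True))
```

With the option off, the prefixed host counts as part of the site; with the option on, the same link is reported as external.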
Exclude Links
If you want links containing certain masks to be excluded from the scan, enter these masks into the Exclude field. All links containing any of these masks will be marked as external, and the pages they lead to will not be scanned. If you enter a mask into the Ignore field, links matching this mask will be ignored and will not be displayed.
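The difference between the two fields can be sketched in Python as follows. The sketch assumes that a mask is a plain substring of the link; the actual matching rules of the program may differ, and the function name classify and the sample masks are illustrative only.

```python
def classify(link, exclude_masks=(), ignore_masks=()):
    """Return how a link is treated: "ignored", "external" or "scan"."""
    if any(mask in link for mask in ignore_masks):
        return "ignored"      # not checked and not displayed
    if any(mask in link for mask in exclude_masks):
        return "external"     # displayed, but the target page is not scanned
    return "scan"             # ordinary link

print(classify("http://www.yourhost.com/forum/topic.php", exclude_masks=["/forum/"]))
print(classify("http://www.yourhost.com/banner.php", ignore_masks=["banner"]))
print(classify("http://www.yourhost.com/documents/docs.html", exclude_masks=["/forum/"]))
```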
Directory Index
When you enter a URL into your browser's address bar, it looks like the following: http://www.myhost.com. This is the address of the root folder of your web resource. But the browser opens a page, not a folder. How does this happen? The trick is that the web server automatically sends the default page to your browser. The default page usually has one of the following names: default.asp, index.htm, index.html, index.shtml, etc. Thus, the full address of your default page is something like this: http://www.myhost.com/index.html. Link Utility may index this page twice, once with and once without the default page's name in the address. To avoid indexing the same default page twice, enter its name into the Index file names field.
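The idea of treating both addresses as one page can be sketched in Python. This is an illustration, not the program's code: the function canonical and the INDEX_NAMES set are hypothetical, and the set simply repeats the file names listed above.

```python
from urllib.parse import urlsplit, urlunsplit

# Names that would be entered into the Index file names field.
INDEX_NAMES = {"default.asp", "index.htm", "index.html", "index.shtml"}

def canonical(url, index_names=INDEX_NAMES):
    """Strip a trailing index file name so both address forms compare equal."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    if filename.lower() in index_names:
        path = directory + "/"          # keep only the folder
    else:
        path = parts.path or "/"        # an empty path means the root folder
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))

print(canonical("http://www.myhost.com"))
print(canonical("http://www.myhost.com/index.html"))
```

Both calls produce the same address, so a scanner that compares canonical addresses would index the default page only once.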
Scripts Analysis
A typical link has the following hypertext structure: <a href="http://www.yourhost.com">Some text</a>. You can check this by opening a page in any text editor, such as Notepad, or by selecting View>Source in the Internet Explorer main menu. The program easily identifies the <a> tag and recognizes the text in quotation marks as a link. But sometimes links are embedded in the bodies of scripts and other objects: JavaScript, Cascading Style Sheets, Macromedia Flash objects, etc. A user can click through these links in a web browser, but it is difficult for automated applications to scan these objects. If you enable Script analysis, the program will scan the body of the document for JavaScript objects and look for links inside them. You can set certain parameters for script analysis.
- If you select Search by protocol, Link Utility will recognize links inside JavaScript by the http:// protocol prefix and index them.
- Not all links contain the http:// protocol prefix. The program can also recognize links by the typical extensions of HTML documents and multimedia files: .html, .htm, .php, .jpeg, .mp3, etc.
- Links to pages inside your web site are often relative links, which show the path to the target page without the protocol and site URL. For example, if the root folder of your web site contains a subfolder documents, which in turn contains the file docs.html, the full address of that file is http://www.yourhost.com/documents/docs.html. But if the link is on the default web page, which usually sits in the site's root folder, there is no need to give the full path to the document: documents/docs.html, where documents is the folder and docs.html is the target document, is enough. You can enable the search for relative links by the / (directory delimiter) they may contain.
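The three search options can be approximated with a short Python sketch that looks at the quoted strings inside a script. It is a rough illustration under stated assumptions, not the algorithm the program uses: the function links_in_script, its parameters, and the sample script are invented, and the slash rule in particular can produce false matches (for example on a string like "text/javascript").

```python
import re

STRING = re.compile(r"""(["'])(.*?)\1""")             # quoted string literals
EXTENSIONS = (".html", ".htm", ".php", ".jpeg", ".mp3")

def links_in_script(script, by_protocol=True, by_extension=True, by_slash=True):
    """Collect string literals from a script that look like links."""
    found = []
    for _quote, text in STRING.findall(script):
        is_link = (
            (by_protocol and text.startswith("http://"))            # protocol prefix
            or (by_extension and text.lower().endswith(EXTENSIONS))  # file extension
            or (by_slash and "/" in text)                            # directory delimiter
        )
        if is_link and text not in found:
            found.append(text)
    return found

SAMPLE = '''function go() {
    window.location = "http://www.yourhost.com/news.php";
    var logo = 'logo.jpeg';
    window.open("documents/docs.html");
    alert("Hello");
}'''

print(links_in_script(SAMPLE))
print(links_in_script(SAMPLE, by_extension=False, by_slash=False))
```

The first call reports the absolute URL, the image, and the relative link; the second call, with only the protocol search enabled, reports just the absolute URL. The string "Hello" is never treated as a link.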
Advanced script analysis
On this tab you can set advanced parameters for the analysis of scripts: JavaScript code, .css (Cascading Style Sheets) files and objects, and Macromedia Flash objects.
- The first eight check boxes provide additional scan options within JavaScript tags. You can set the program to analyze JavaScript events (OnClick, MouseOver), the "Value=" parameter, the "Script" section, and external JavaScript files (.js).
- Two check boxes set scan options within <style> tags and external .css files.
- If you select the last check box, the program will analyze Macromedia Flash objects.
See also:
JavaScript tutorial
Cascading Style Sheets
Macromedia Flash official site
Verify Links
This tab provides options for scanning links within certain HTML tags. For information about HTML tags and the links they may contain, see on the Internet: Understanding HTML Tags
Error pages
When a web browser requests a page that cannot be found on the site, the web server usually returns the 404 error code. The most common reasons for this error are that you have mistyped the address, or that the requested page has been deleted or moved. But sometimes webmasters provide custom 404 error pages, and the web server automatically redirects you to them. Usually such a page is an HTML document saying that the requested page could not be found and offering some ways to rectify the problem. Since the program does not receive an error code in this case, it may erroneously treat such error pages as "good" ones. To avoid this and identify such links as broken, you can enter masks that enable the program to recognize redirects to error pages.
Error Pages>Redirects. This option allows you to set redirect masks. For example, if you request a file with the URL http://www.yourhost.com/documents/docs.html and it is not there, the server may redirect you to the error page http://www.yourhost.com/error/404.html. This means that the error page, named 404.html, is located in the subfolder error of the site's root folder. So you can create the mask error/404.html and enter it into the Redirect field, and the program will "know" that this is an error, not an ordinary redirect.
Error Pages>Titles. In this field you can enter typical titles of custom error pages. The program will scan the title of a page, and if it matches one of the masks you have entered, it will identify the whole page as an error page.
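Both checks can be sketched in a few lines of Python. This is a simplified illustration, not the program's code: the function is_error_page and its parameters are hypothetical, masks are assumed to be plain substrings, and the title comparison is assumed to ignore letter case.

```python
import re

TITLE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)

def is_error_page(final_url, html, redirect_masks=(), title_masks=()):
    """Detect a custom error page by its redirect target or by its title."""
    # Redirect mask: the address the server finally sent us to.
    if any(mask in final_url for mask in redirect_masks):
        return True
    # Title mask: the text between <title> and </title>, compared in lower case.
    match = TITLE.search(html)
    title = match.group(1).strip().lower() if match else ""
    return any(mask.lower() in title for mask in title_masks)

page = "<html><head><title>Page Not Found</title></head><body>Sorry!</body></html>"
print(is_error_page("http://www.yourhost.com/error/404.html", "",
                    redirect_masks=["error/404.html"]))
print(is_error_page("http://www.yourhost.com/documents/docs.html", page,
                    title_masks=["not found"]))
```

A page that matches either kind of mask would be reported as a broken link even though the server did not return an error code for it.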
Page Optimization
On this tab you can set the parameters for marking your pages as slow/small, old/new, or deep.
- Slow pages. Whether a page is slow is determined by its size, including all multimedia files (pictures, music, Flash objects). The bigger the page, the slower it loads. By default the size limit for slow pages is set to 50 kilobytes, i.e. all pages exceeding this size will be considered slow.
- Small pages. Pages not bigger than the specified size will be considered small. By default the value is set to 2 kB.
- New and old pages. You can set the parameters for pages to be considered either old or new.
- Deep pages. Pages sitting deeper in the directory tree than the specified number of levels will be considered deep.
- Slow images. Images exceeding the specified size will be considered slow.
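The size and depth rules above can be expressed as a small Python sketch. The 50 kB and 2 kB defaults come from the text; everything else is an assumption made for illustration: the function page_flags, the default of 3 levels for deep pages, counting 1 kB as 1024 bytes, and measuring depth as the number of folders in the URL path.

```python
from urllib.parse import urlsplit

def page_flags(url, total_bytes, slow_kb=50, small_kb=2, deep_levels=3):
    """Return the list of marks ("slow", "small", "deep") for one page.

    total_bytes is the page size including its pictures and other media.
    """
    flags = []
    if total_bytes > slow_kb * 1024:
        flags.append("slow")                    # exceeds the slow-page limit
    if total_bytes <= small_kb * 1024:
        flags.append("small")                   # not bigger than the small-page limit
    # Folder depth: path segments before the file name, e.g. /a/b/page.html -> 2.
    folders = [part for part in urlsplit(url).path.split("/")[:-1] if part]
    if len(folders) > deep_levels:
        flags.append("deep")
    return flags

print(page_flags("http://www.yourhost.com/index.html", 60 * 1024))
print(page_flags("http://www.yourhost.com/a/b/c/d/page.html", 1500))
```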
Orphan Analysis
Orphaned files are files stored in your web folders but not linked to by any page on your site. A file can become orphaned for a number of reasons: a web page with the references to this file has been removed or modified, or the file's name has been changed. As your web site grows, you can easily end up with lots of pages that are not really used. In other words, they sit in your web folders, but no links point to them, so nobody will ever reach them by following links on your web site.
You have two options: local search and search via FTP. To check for orphaned files locally, select 'Local or network computer' in Project Settings> Orphan Analysis. Then specify the initial directory and the index file of your local copy.
To search for orphaned files directly on your web server, select 'Search via FTP' in Options> Orphaned files and set up your FTP connection. If you select the FTP server radio button, the following fields will appear:
- Server name - enter the name of your FTP server
- Port - FTP port (usually 21)
- Username - the user name that allows you to access and modify your FTP folder
- Password - the password associated with your user name
- Passive mode for FTP transfers - see FTP - Active And Passive Modes
The next two tabs provide the following options:
- Initial Paths - specify the initial path to your site's folder, either on the web server or saved locally.
- Exclude files. Some files perform auxiliary functions, such as log files, robots.txt, and .htaccess files. These files don't usually have incoming links because they are not meant to be accessed by visitors. Log files are created by the server to trace visitors coming to the site, robots.txt contains directives for automatic search engine robots indexing the site, and files whose names start with .ht (with a leading dot), like .htaccess, contain directives for Apache servers. These files should not be flagged as orphans. You can set the program to identify these files and exclude them from the search by entering file names or directory masks. For example, if you enter .htaccess into the field, all files with this name will be excluded from the check. Another example: server log files are usually stored in a separate directory. So if you enter the mask apache/htdocs, all files in the subfolder htdocs of the folder apache will be left out of the check.
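In essence, orphan analysis compares the list of files in the site's folder with the list of files reached by following links, and then drops the excluded files. A minimal Python sketch of that comparison is shown below. It is not the program's implementation: the function find_orphans and the sample file names are hypothetical, and masks are assumed to be plain substrings of the file path.

```python
def find_orphans(files, linked, exclude_masks=()):
    """Return files that exist in the site's folder but are never linked to.

    files         - paths found locally or via FTP, relative to the initial path
    linked        - paths reached by following links from the index file
    exclude_masks - file names or directory masks that are never flagged
    """
    linked = set(linked)
    return sorted(
        path for path in files
        if path not in linked
        and not any(mask in path for mask in exclude_masks)
    )

files = ["index.html", "documents/docs.html", "old/page.html",
         ".htaccess", "robots.txt", "apache/htdocs/access.log"]
linked = ["index.html", "documents/docs.html"]
print(find_orphans(files, linked, [".htaccess", "robots.txt", "apache/htdocs"]))
```

Here only old/page.html is reported: the auxiliary files are filtered out by the masks, and the remaining files are reachable through links.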
Availability
The Availability scan checks how easily your web site can be accessed. The anonymous proxies used for the Availability scan are proxy servers located all over the world, which makes it possible to check the access speed to your site from different locations. If the access speed is too low, you can move your web site to another web server with better accessibility or a faster internet connection. You can update the list of anonymous proxies from the program's support server.
See also:
Anonymous Proxy Server
Updates