Creating a file content crawler with ColdFusion....
This tutorial will show you how to create a local file crawler that finds every file of a specified type (e.g. PDF files) within a directory and its child directories.
Let me begin by explaining what a crawler is, since some of you might be asking... a what? :)
A crawler is a script that walks through a set of locations and returns the items that match what you told it to find. I think the best explanation is the actual code itself, so let's get started:
The first example is a local file crawler. Say you have a directory structure that looks like this:
D:\websites\information.pdf
D:\websites\account_info.pdf
D:\websites\mysite.com\info.pdf
D:\websites\hello kitty\free_stuff.pdf
Notice that the PDF files are spread across different folders under the D:\websites folder, so that folder will become the ROOT FOLDER.
<!--- define an empty variable that will become a list of directories to search later in the application --->
<cfset current_directory_to_crawl = "">
<!--- now by default define the root folder to search, in this example D:\websites\ --->
<cfset next_directory_to_crawl = "D:\websites\">
<!--- now define a variable that will tell the application later on if it should continue. By default set the value to 'one' --->
<cfset crawl_again = 1>
<!--- now define a variable that will count the number of files found and set it to 'zero' by default --->
<cfset file_counter = 0>
<!--- do >>ONLY<< one extension per run --->
<cfset extension_to_crawl = "pdf">
<!--- define a variable to hold the file names of the files found --->
<cfset file_container = "">
<!--- create a container to hold all files processed (if you want to move them elsewhere) --->
<cfset file_completed = "">
<!--- ok, begin the processing here because the variable crawl_again is set to 1 (stop when set to 0) --->
<cfloop condition="crawl_again neq 0">
    <!--- first switch the directory values --->
    <cfset current_directory_to_crawl = next_directory_to_crawl>
    <!--- now clear the next --->
    <cfset next_directory_to_crawl = "">
    <!--- clear the file container --->
    <cfset file_container = "">
    <!--- now loop through the list of directories to crawl and look for the extension --->
    <cfloop list="#current_directory_to_crawl#" index="dir" delimiters="|">
        <!--- now list the directory contents --->
        <cfdirectory action="LIST" directory="#dir#" name="CurrentPull">
        <!--- go through everything the CFDIRECTORY returned --->
        <cfloop query="CurrentPull">
            <!--- process everything returned in the CFDIRECTORY with the exception of
                  the first two records, which are "." and "..". Those can be skipped
                  for this example. Both comparisons must hold, so they are joined
                  with AND (with OR the test is always true) --->
            <cfif name neq "." AND name neq "..">
                <!--- display the current file/directory to the screen --->
                <cfoutput>#name#<BR></cfoutput>
                <!--- let's see if the current item is a file or directory --->
                <cfif type eq "dir">
                    <!--- found a directory, set this folder as crawlable so on the
                          next loop we can search it for PDF files --->
                    <cfset next_directory_to_crawl = ListAppend(next_directory_to_crawl, dir & name & "\", "|")>
                <cfelseif type eq "file">
                    <!--- this is a file, see if the extension of the file is the one defined above --->
                    <cfif ListLast(name, ".") eq extension_to_crawl>
                        <!--- here it checks to make sure that this file and its path are UNIQUE --->
                        <cfif NOT ListFind(file_completed, dir & name, "|")>
                            <!--- mark this file as completed --->
                            <cfset file_completed = ListAppend(file_completed, dir & name, "|")>
                            <!--- add the file to the container --->
                            <cfset file_container = ListAppend(file_container, dir & name, "|")>
                            <!--- add one to the file counter --->
                            <cfset file_counter = file_counter + 1>
                        </cfif>
                    </cfif>
                </cfif>
            </cfif>
        </cfloop>
    </cfloop>
    <!--- now output the final values to the screen so we can see them --->
    <cfoutput>
    <hr><ol>
    <cfloop list="#next_directory_to_crawl#" index="folder" delimiters="|">
        <li>#folder#</li>
    </cfloop>
    </ol>
    <hr><ol>
    <cfloop list="#file_container#" index="files" delimiters="|">
        <li>#files#</li>
    </cfloop>
    </ol>
    <HR>Files Found: #file_counter#<hr>
    </cfoutput>
    <cfif next_directory_to_crawl eq "">
        <!--- there are no more folders to crawl, stop the main loop --->
        <cfset crawl_again = 0>
    </cfif>
</cfloop>
That's pretty much it! You now have a local crawler that finds files of a given type under a root folder, and you can build on it to do much more.
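The crawler above walks the folder tree one level at a time by hand. If your server is ColdFusion MX 7 or later, CFDIRECTORY can do the walking itself, so here is a minimal sketch of that alternative. It is not part of the original example: it assumes the recurse and filter attributes and the directory result column are available on your version, it reuses the D:\websites\ root from above, and root_directory and AllPdfs are names made up for this illustration.

```cfml
<!--- Assumption: ColdFusion MX 7 or later, where CFDIRECTORY supports recurse="yes" --->
<cfset root_directory = "D:\websites\">
<cfset extension_to_crawl = "pdf">
<cfset file_counter = 0>

<!--- List every matching entry under the root folder, including its sub-folders --->
<cfdirectory
    action="list"
    directory="#root_directory#"
    recurse="yes"
    filter="*.#extension_to_crawl#"
    name="AllPdfs">

<cfoutput>
<ol>
<cfloop query="AllPdfs">
    <!--- Skip any folder whose name happens to match the filter --->
    <cfif AllPdfs.type eq "file">
        <li>#AllPdfs.directory#\#AllPdfs.name#</li>
        <cfset file_counter = file_counter + 1>
    </cfif>
</cfloop>
</ol>
<hr>Files Found: #file_counter#<hr>
</cfoutput>
```

The hand-written loop is still the one to use on older servers, or when you want to act on each folder as it is discovered.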
Questions? Comments?
Email Me....
-
A brief demonstration of Fusebox 2.0
This is a brief demonstration on how to use Fusebox 2.0 Methodology.
Author: Pablo Varando
Views: 19,984
Posted Date: Friday, September 6, 2002
-
A quick intro into the world of Custom Tags!
The following tutorial will briefly touch on Custom Tags and show you what they are, how you use them, and how they benefit you.
Author: Pablo Varando
Views: 24,063
Posted Date: Friday, September 6, 2002
-
A Simple Contact Us Page….
Learn how to create a contact page in ColdFusion.
Author: Pablo Varando
Views: 32,228
Posted Date: Tuesday, August 13, 2002
-
Alternating Row Colors!
This tutorial will demonstrate how to alternate row colors when outputting your data.
Author: Pablo Varando
Views: 32,085
Posted Date: Tuesday, September 17, 2002
-
Automatically Adding Smiles To Your Messages!
This tutorial will show you how you can add smiles to your messages on the fly!
Author: Pablo Varando
Views: 21,059
Posted Date: Tuesday, October 29, 2002
-
CaSe SensitiVe password logins!
This tutorial will demonstrate how to verify users' passwords as CaSe SensiTive to add another layer of security to your applications!
Author: Pablo Varando
Views: 50,272
Posted Date: Wednesday, February 5, 2003
-
Changing the form submission page on the fly!
This tutorial is not ColdFusion oriented, but covers a great trick to allow you to submit a single form to a variety of different pages on the fly.
Author: Pablo Varando
Views: 17,009
Posted Date: Monday, December 1, 2003
-
Clearing your session variables!
This tutorial will demonstrate how to clear your application's session variables with just three lines of code!
Author: Pablo Varando
Views: 25,458
Posted Date: Friday, October 4, 2002
-
ColdFusion and .INI Files!
This tutorial will demonstrate how to use .INI files with ColdFusion. Perfect for users wishing to create administration areas for existing software titles that are INI file driven (e.g. FTP servers).
Author: Pablo Varando
Views: 19,414
Posted Date: Friday, October 4, 2002
-
Combining two queries into one..
This tutorial will demonstrate how to create a query from two different queries that come from two separate datasources. This is the easiest way to combine your data.
Author: Pablo Varando
Views: 21,412
Posted Date: Monday, March 10, 2003
-
Correct Content (document) serving!
This tutorial will demonstrate how to correctly serve documents via ColdFusion and allow you to correctly name the download as you see fit!
Author: Pablo Varando
Views: 15,206
Posted Date: Tuesday, February 10, 2004
-
Count Active Users On Your Site.
Have you ever wanted to display a count of how many people are on your web site at any given moment? This tutorial will demonstrate to you how to achieve just that. It will count your web site's active sessions and allow you to display them to your visitors.
Author: Pablo Varando
Views: 30,782
Posted Date: Sunday, August 25, 2002
-
Creating a file content crawler with ColdFusion....
This tutorial will show you how to make a file content crawler with ColdFusion to find specified documents in a folder and its child folders (similar to the 'find files or folders' feature in Windows operating systems).
Author: Pablo Varando
Views: 24,831
Posted Date: Saturday, July 19, 2003
-
Creating a Newsletter System....
This tutorial will show you how to create a fully automated system to allow visitors to subscribe and unsubscribe to your newsletter, and for administrators to send out a newsletter to all the registered users.
Author: Pablo Varando
Views: 29,743
Posted Date: Friday, September 6, 2002
-
Creating a user authentication (Login) area.
This tutorial will demonstrate how you can create a "member's only" area. It will show you how to log them in and how to check that they are logged in.
Author: Pablo Varando
Views: 75,972
Posted Date: Monday, August 19, 2002
-
Creating an ODBC Connection within ColdFusion MX Server...
This tutorial will show you how to create an ODBC (Database) connection from within your ColdFusion MX Administration Area.
Author: Pablo Varando
Views: 28,977
Posted Date: Monday, January 6, 2003
-
Creating your very own RSS XML Feeds with ColdFusion MX!
Have you ever wanted to create your very own RSS XML News Feeds? This tutorial will show you how to create an RSS feed that will allow you to syndicate your web site and allow the world to easily use your data!
Author: Pablo Varando
Views: 29,623
Posted Date: Thursday, January 15, 2004
-
Creating, Altering and Deleting database tables with ColdFusion.
This tutorial will show you how to create, modify and delete database tables easily with ColdFusion.
Author: Pablo Varando
Views: 25,948
Posted Date: Monday, October 14, 2002
-
Delete files and folders in a specified path!
This tutorial will demonstrate how you can delete all files and sub-folders in a specified folder using ColdFusion and Windows!
Author: Pablo Varando
Views: 15,816
Posted Date: Wednesday, September 7, 2005
-
Delete Records From Your Database With ColdFusion!
This tutorial will demonstrate how to delete records from a database via your website using ColdFusion.
Author: Pablo Varando
Views: 26,177
Posted Date: Friday, July 4, 2003
-
Do you want to remember your members?
This tutorial will show you how you can provide your members with the ability to save their username and password, so they don't have to type them in every time they want to log in to your web site.
Author: Pablo Varando
Views: 22,419
Posted Date: Tuesday, May 13, 2003
-
DSNLess Coldfusion?
Learn how to create a database connection in ColdFusion without the old ODBC connections.
Author: Pablo Varando
Views: 24,097
Posted Date: Friday, August 16, 2002
-
Dynamic Last Date Modified?
This tutorial will demonstrate how to display the date a web page was last modified to your visitors dynamically.
Author: Pablo Varando
Views: 13,498
Posted Date: Monday, April 12, 2004
-
Get A Folder Size Using ColdFusion and FSO...
This tutorial will demonstrate how you can get the size of a folder (and sub folders) using ColdFusion and Windows File System Object (FSO).
Author: Pablo Varando
Views: 18,575
Posted Date: Tuesday, April 8, 2003
-
Having Your Database Do The Work… not ColdFusion!
This tutorial will demonstrate how you can use the functions that come built in on your database to do the work, instead of doing the work in your code the hard way!
Author: Pablo Varando
Views: 26,412
Posted Date: Thursday, August 8, 2002
-
Implementing FORM Error Checking On Your Pages!
This tutorial will show you two ways you can implement error checking, to ensure that your users are actually filling in the required fields on your forms!
Author: Pablo Varando
Views: 20,628
Posted Date: Wednesday, October 16, 2002
-
Inserting data into a database
This tutorial will show you how to insert data into a database, then have that information emailed to you and the person submitting the data.
Author: Pablo Varando
Views: 39,214
Posted Date: Thursday, August 1, 2002
-
Inserting FORM data into multiple database tables!
This tutorial will demonstrate how you can use one form a user fills out to insert into multiple database tables.
Author: Pablo Varando
Views: 24,709
Posted Date: Tuesday, October 15, 2002
-
Preventing People From Leeching Your Images!
This tutorial will show you how to load your images from an actual .cfm page. Therefore, allowing you to prevent people from using your content on their web sites.
Author: Pablo Varando
Views: 28,357
Posted Date: Friday, March 14, 2003
-
Previous / Next n Records
This tutorial demonstrates how to implement "Previous" & "Next" links in your existing results page.
Author: Pablo Varando
Views: 30,321
Posted Date: Tuesday, September 17, 2002
-
Print your web pages on the fly!
This tutorial will demonstrate how to use ColdFusion, JavaScript and style sheets to create the perfect Printing Machine! ;)
Author: Pablo Varando
Views: 25,739
Posted Date: Sunday, December 15, 2002
-
Processing XML/RSS feeds with ColdFusion MX
This tutorial will show you how to parse XML files (RSS feeds) with ColdFusion MX, using an EasyCFM.COM feed as the example [Feed: 5 Most Viewed Tutorials]. It covers everything from calling the feed via CFHTTP to parsing and displaying your records!
Author: Pablo Varando
Views: 24,762
Posted Date: Saturday, December 27, 2003
-
Reading your IIS Log Files with ColdFusion!
This tutorial will show you how you can parse through your IIS (web server) log files and insert the values into a database, therefore allowing you to display real-time stats to your visitors (e.g. hits this week).
Author: Pablo Varando
Views: 29,004
Posted Date: Monday, November 4, 2002
-
Retrieving Records From a Database..
This is the basics of ColdFusion. This tutorial will demonstrate how to query a database and then display the records found.
Author: Pablo Varando
Views: 31,611
Posted Date: Saturday, August 3, 2002
-
Sending multiple attachments with CFMAIL!
This tutorial will demonstrate how to send out multiple attachments with <CFMAIL>.
Author: Pablo Varando
Views: 34,480
Posted Date: Friday, October 11, 2002
-
User Defined Functions....
Learn how to use User-Defined Functions in ColdFusion 5.0.
Author: Pablo Varando
Views: 18,284
Posted Date: Wednesday, August 21, 2002
-
Using <CFPOP> and creating an email client for POP3 Email Reading!
This tutorial will show you how to create a mail system for your site. It will allow you to get your email from a POP3 server, view your inbox, then view the message (with attachments), reply and delete that message as well.
Author: Pablo Varando
Views: 27,949
Posted Date: Thursday, November 7, 2002
-
Using Arrays in ColdFusion To Properly Display Data....
This tutorial will show you how to use arrays to display data properly in ColdFusion.
Author: Pablo Varando
Views: 24,463
Posted Date: Monday, October 28, 2002
-
Using CFRegistry to Add Your IP To CF Debug IP List!
This tutorial is intended to show you how to use the ColdFusion tag <CFRegistry>. This tutorial will show you how to add your current IP to the IP Debug List in the ColdFusion Administrator.
Author: Pablo Varando
Views: 21,051
Posted Date: Wednesday, November 6, 2002
-
Using PayPal's IPN with ColdFusion!
This tutorial will demonstrate how to implement the PayPal IPN (Instant Payment Notification) into your e-commerce applications to accept credit cards in real time!
Author: Pablo Varando
Views: 62,244
Posted Date: Wednesday, September 25, 2002
-
Using Query String Values....
This tutorial will demonstrate how to use query string values instead of form values.
Author: Pablo Varando
Views: 34,380
Posted Date: Sunday, September 15, 2002
-
What is the ID for the record I just inserted?
This tutorial will demonstrate how you can get the ID of the record you have just inserted without having to connect to the database again!
Author: Pablo Varando
Views: 21,243
Posted Date: Monday, August 11, 2003