[Monday, April 17, 9:55]

> Now I have a question. In sql,  the FROM clause should be executed before
> the WHERE, right?
> why my code not working:
>
> select table_name, x
> from (select table_name, count(column_name) x
>     from cols
> where not exists (select * from conum y
>                   where conum.x < y.x);
>
> the error mesg is
> where not exists (select * from conum y
>                                                 *
> ERROR at line 5:
> ORA-00942: table or view does not exist
>
> Could you give me an explain on this?
>
> Thank you very much.

When you write a subquery in the FROM clause, you do not create a new table.
"conum" is a tuple variable which ranges over the result of the subquery,
but when you need another tuple variable over the same subquery result,
you must write the subquery again.
In this case it might be simpler to define the subquery as a view.
If you do this, you give the result of the subquery a name, and can refer to
it any number of times.
Otherwise, the result of the subquery has no name, only the tuple variable
has a name.

-------------------------------------------------------------------------------

[Monday, April 17, 10:30]

> I have a couple of questions on the recent homework.
>
> Part b)
> Why doesn't this work?
>
> SELECT TABLE_NAME, COUNT(COLUMN_NAME)
> FROM DICT_COLUMNS
> GROUP BY TABLE_NAME
> HAVING COUNT(COLUMN_NAME) = MAX(SELECT COUNT(COLUMN_NAME)FROM DICT_COLUMNS
> GROUP BY TABLE_NAME);
>
> Here is the error message:
>
> HAVING COUNT(COLUMN_NAME) = MAX(SELECT COUNT(COLUMN_NAME)
>                                 *
> ERROR at line 4:
> ORA-00936: missing expression

You cannot nest aggregations in this way.
The argument of MAX must be a column or something similar,
but a subquery is not allowed, although it would make sense.
After all, aggregations like MAX take a set as input and your subquery returns
such a set. But the SQL syntax is restricted in this way.

> Part e)
> I'm not sure I fully understand the question. Here is the SQL I am running:
>
> SELECT DISTINCT a.CONSTRAINT_NAME, a.COLUMN_NAME, a.TABLE_NAME
> FROM USER_CONS_COLUMNS a, USER_CONSTRAINTS b
> WHERE a.TABLE_NAME = b.TABLE_NAME
> AND b.TABLE_NAME = 'RESULTS'
> AND b.CONSTRAINT_TYPE = 'R'
> ORDER BY a.CONSTRAINT_NAME;
>
> Here is a sample result:
>
> CONSTRAINT_NAME                          COLUMN_NAME     TABLE_NAME
>  -------------------------                          --------------------
>  -----------------
>   POINTS_MUST_BE_GREATER_0      POINTS                     RESULTS
>   RESULTS_KEY                                    CAT
> RESULTS
>   RESULTS_KEY                                    EX_NO
> RESULTS
>   RESULTS_KEY                                    STUD_ID
> RESULTS
>   RESULTS_REF_EXERCISES              CAT                             RESULTS
>   RESULTS_REF_EXERCISES              EX_NO                       RESULTS
>   RESULTS_REF_STUDENTS               STUD_ID                    RESULTS
>   SYS_C00857                                        CAT
> RESULTS
>   SYS_C00858                                        EX_NO
> RESULTS
>   SYS_C00859                                        STUD_ID
> RESULTS
>   SYS_C00860                                        POINTS
> RESULTS
>   SYS_C00861                                        ENTERED
> RESULTS
>
> Is this what you are looking for?

No, the output should be something like this:

CONTSTRAINT_NAME          COLUMN_NAME   TABLE_NAME
================          ===========   ==========
RESULTS_REF_EXERCISES     CAT           EXERCISES
RESULTS_REF_EXERCISES     EX_NO         EXERCISES
RESULTS_REF_STUDENTS      STUD_ID       STUDENTS

I agree that the formulation is not very clear.
I really should have put this table into the exercise to clarify what I want.

Your join between USER_CONS_COLUMNS a and USER_CONSTRAINTS b does not work
because you check only that the table name is the same.
You must check that the constraint name and owner agree.

Also, it might be that two tuple variables are not enough for this query
(I am not sure, but my solution needs three).

I hope this helps.

-------------------------------------------------------------------------------

> I have a question regarding data dictionary (14-9).  DICT and DICT_COLUMNS
> seem to have different number of distinct TABLE_NAME.  Could you explain why?
> It looks like that DICT_COLUMNS include more dynamic performance tables,
> but I'm not sure.  I came up with this question while I was working on the
> homework 11-b.  Am I supposed to use all the tables listed in DICT_COLUMNS or
> only the ones listed in DICT?  Perhaps it'll all get clearer to me once I've
> read the textbook (I haven't because I wanted to work on the homework first
> in case I need to ask you some questions).  I'd appreciate it if you could
> either explain it or direct me to the section in the textbook.

Yes, you are right. I checked it with
        select distinct table_name from dict_columns x where not exists
        (select * from dict y where y.table_name = x.table_name)
This indeed lists a lot of dynamic performance tables (V$*).

I tried the query again as DBA, and the output was empty.
So the Oracle Programmer who defined DICT used your status as normal user or
DBA, whereas the definition of DICT_COLUMNS seems to lack this check.
You might check
        $ORACLE_HOME/rdbms/admin/catalog.sql
to see the definitions of these views.

In Homeback 11b) it is probably simplest to consider all tables listed in
DICT_COLUMNS. If you believe that the other interpretation of the exercise
would be simpler, you can do that.

-------------------------------------------------------------------------------

[Monday, April 17, 15:00]

> I seem to be having a problem with part B of the homework.
>
> The SQL I am using is as follows:
>
> SELECT d.Table_Name, COUNT(dc1.Column_Name)
> FROM  Dict d, Dict_Columns dc1, Dict_Columns dc2
> WHERE d.Table_Name=dc1.Table_Name
> GROUP BY d.Table_Name
> HAVING COUNT(dc1.Column_Name)>=COUNT(dc2.Column_Name)
>
> The problem is that this ran for over 2 hours without a response.

DICT_COLUMNS has 4835 rows, and you compute the cartesian product between
two copies of DICT_COLUMNS, so the system will consider 23377225 combinations
(not to mention that table DICT, for which you have a join condition.
But still, in two hours this should have been finished.
Maybe it is so slow because DICT_COLUMNS is itself not a stored table, but
a view, which itself contains a UNION ALL and a join of several tables
(see $ORACLE_HOME/rdbms/admin/catalog.sql).

However, comparing counts in the HAVING clause will very seldom work in this
way. Unless you use DISTINCT or an attribute which can be null, COUNT will
return the same number, no matter which attribute you use as input.
You count here the combinations of tuples which are constructed in the where
clause. So for a table with 10 columns, both counts will return
10 * 4835 = 48350. Basically, in a single SELECT FROM WHERE GROUP BY,
you can construct only one set of groups. You can do independent counts only
if you use subqueries.

You must use something like COUNT(...) >= ALL (subquery which returns the
number of columns of every data dictionary table).

-------------------------------------------------------------------------------

[Monday, April 17, 15:10]

> I have a couple questions for part c and d of homework #10.  For part c, I
> can not give read privileges to "scott" through the grant command because I
> logged in as "scott" when I created the view (so scott already has read
> privileges).  Do I have to create the tables and view under a different
> user, so I can give scott read access?
>
> For part d, I assume that we should revoke access rights for scott, so he
> can not access the base table.  If so, should we revoke all privileges for
> scott, or just read access?

Yes, you need two different accounts.
You must create the tables under another account, not under "scott".
If you work at home with Personal Oracle, you have two options:
        - Create the tables as SYSTEM
          (not very elegant, since SYSTEM is a system account)
        - Create a new user as explained in the security chapter

Then you also don't have to revoke the access rights, you only not grant them.
I don't know whether you can revoke rights from the owner of an object
(I have never tried that and don't have much time now).

-------------------------------------------------------------------------------

[Tuesday, April 18, 9:30]

> I have a question about letter b on homework 11.  I was having
> difficulties with the query and I read the web page and found out what I
> was doing wrong.  Here is the first query that I tried:
>
> SQL> SELECT TABLE_NAME, COUNT(COLUMN_NAME)
>   2  FROM DICT_COLUMNS
>   3  GROUP BY TABLE_NAME
>   4  HAVING COUNT(COLUMN_NAME) >= ALL (SELECT COUNT(COLUMN_NAME)
>   5                                     FROM DICT_COLUMNS
>   6                                     GROUP BY COLUMN_NAME);
>
> no rows selected
>
> and the second query:
>
> SQL> SELECT DISTINCT TABLE_NAME, COUNT(COLUMN_NAME)
>   2  FROM DICT_COLUMNS
>   3  GROUP BY TABLE_NAME
>   4  HAVING COUNT(COLUMN_NAME) >= ALL (SELECT DISTINCT COUNT(COLUMN_NAME)
>   5                                     FROM DICT_COLUMNS 
>   6                                     GROUP BY COLUMN_NAME);
>
> no rows selected
>
>
> Can you help me with why these don't work?

They are nearly correct, but the error is that you group by the column name
in the subquery. For such "maximum count" or "maximum sum" queries, you must
group by the same attribute in both, the main query and the subquery.
You want that the count of columns for a table is maximal among all tables.

Whether you use DISTINCT in the subquery or not, doesn't change anything,
>= ALL needs to see each distinct value only once.
I am not sure which version runs faster, but I would assume (?) that the first
one in slightly faster. But this is really only a guess.

-------------------------------------------------------------------------------

[Tuesday, April 18, 9:40]

> Here is my query for homework 11, question e:
>
> Why is it returning no results?
>
>
> SQL> SELECT DISTINCT U.CONSTRAINT_NAME, U.COLUMN_NAME, U.TABLE_NAME
>   2  FROM USER_CONS_COLUMNS U, USER_CONSTRAINTS C
>   3  WHERE U.TABLE_NAME = C.TABLE_NAME
>   4  AND C.OWNER = U.OWNER
>   5  AND U.CONSTRAINT_NAME = C.R_CONSTRAINT_NAME
>   6  AND C.TABLE_NAME = 'RESULTS'
>   7  AND C.CONSTRAINT_TYPE = 'R'
>   8  ;
>
> no rows selected

Your join condition is too strong: In line 5 you check that U is for the
primary key constraint referenced by the foreign key constraint C.
In line 3 you check that both constraints are for the same table.
This would return only rows if a foreign key references its own table.

You need a tuple variable like C for the foreign key and a tuple variable
like U for the referenced primary key. U.TABLE_NAME should then appear in the
output as the referenced table. However, you do not need the columns of U,
so you can declare U as ranging over USER_CONSTRAINTS (although it works both
ways). In addition, however, you need a third tuple variable ranging over
USER_CONS_COLUMNS to get the columns of the foreign key.
In rge example, the columns of the foreign key and the primary key have the
same name, but this need not be the case, and the exercise requires the columns
of the foreign key.

-------------------------------------------------------------------------------

[Tuesday, April 18, 10:10]

> I ran the query below and left it run for 1/2 hour before I decided to stop
> it. I think in theory this query is correct but it takes a very long time to
> run. Is there something wrong with this that I am missing?
>
> SELECT TABLE_NAME, COUNT(COLUMN_NAME)
> FROM DICT_COLUMNS
> GROUP BY TABLE_NAME
> HAVING COUNT(COLUMN_NAME) >= ALL (SELECT COUNT(COLUMN_NAME)
> FROM DICT_COLUMNS
> GROUP BY TABLE_NAME);
>
> I was thinking that it sort of makes sense that this takes so long to run.
> There are around 600 tables whose columns need to be counted. This needs to
> happen twice, once for each SELECT statement. Then each number in query 1
> has to be compared to each number in query 2. That's 600 x 600 or 360,000
> comparisons. Just for argument's sake, if it makes 10 comparisons a second,
> it will take 36,000 seconds or 10 hours to just make the comparisons.
>
> Please let me know if this thinking is correct. What would be a more
> efficient way to write it?

This is an interesting example.
I should have checked it myself before I pose it as a homework problem.
Your solution is exactly what I would have written.

I believe that the Oracle Query Optimizer is a bit stupid here.
The problem is that the subquery is a non-correlated subquery,
so it needs to be evaluated only once, and in that case the query runs fine.
E.g. I tried the following:

CREATE TABLE DICT_COL_NUMS(NUM) AS
SELECT COUNT(COLUMN_NAME) FROM DICT_COLUMNS GROUP BY TABLE_NAME;

(This is not part of the SQL Standard, but it is a useful Oracle extenstion.)

SELECT TABLE_NAME, COUNT(COLUMN_NAME)
FROM DICT_COLUMNS
GROUP BY TABLE_NAME
HAVING COUNT(COLUMN_NAME) >= ALL (SELECT NUM FROM DICT_COL_NUMS)

This is done quite quickly and the Oracle Optimizer should really evaluate
the original query in this way (or an even more intelligent way, by computing
it only once, ordering the results by the number, and printing the first row).

However, the manual "Oracle8 Concepts" (see acroread $ORACLE_HOME/doc/index.pdf)
explains on page 20-16, that your query will be transformed in a first step to

SELECT TABLE_NAME, COUNT(COLUMN_NAME)
FROM DICT_COLUMNS                                           
GROUP BY TABLE_NAME
HAVING NOT (COUNT(COLUMN_NAME) < ANY (SELECT COUNT(COLUMN_NAME)
FROM DICT_COLUMNS
GROUP BY TABLE_NAME));

And then in a second step to

SELECT TABLE_NAME, COUNT(COLUMN_NAME)
FROM DICT_COLUMNS X
GROUP BY TABLE_NAME
HAVING NOT EXISTS
(SELECT COUNT(COLUMN_NAME)
FROM DICT_COLUMNS Y
GROUP BY TABLE_NAME
HAVING COUNT(X.COLUMN_NAME) > COUNT(Y.COLUMN_NAME));

I didn't even know that this would be legal (actually, I have to read the
standard and check it in other systems, I never thought that you can use
aggregations from the main query in a subquery under HAVING).

Of course, this query also runs "forever".
But I tried the following, and this runs just fine:

SELECT DEPTNO, COUNT(EMPNO)
FROM EMP X
GROUP BY DEPTNO
HAVING NOT EXISTS
(SELECT COUNT(EMPNO)
FROM EMP Y
GROUP BY DEPTNO
HAVING COUNT(X.EMPNO) > COUNT(Y.EMPNO))

    DEPTNO COUNT(EMPNO)
---------- ------------
        10            3

By the way, the other solution which uses an explicit nested aggregation to
find the maximum column number, i.e. uses the subquery
SELECT  MAX(NUM)
FROM    (SELECT COUNT(*) NUM
         FROM DICT_COLUMNS
         GROUP BY TABLE_NAME) X
works also fine. Here the optimizer undestands that it is a non-correlated
subquery.

-------------------------------------------------------------------------------

[Wednesday, April 19, 9:25]

> Question on part c.
>
> When I use the query
>
> SQL> Select *
>   2  from ALL_CATALOG
>   3  where Owner = 'SYSTEM';
>
> I get a couple views (PRODUCT_PRIVS and another I
> can't recall right now) and a table USER_PROFILE
>
> I tried selecting * from each of them and I can only
> access USER_PROFILE.
>
> For the sake of the question, the only "table" is
> USER_PROFILE, but why the descrepency with the views?

You should be able to access all the views, tables etc. listed in ALL_CATALOG.
Unless there is a bug in the view definition, its goal ist to show you exactly
those table-like objects which you can access.

However, you must normally pubt the owner in front of the table name. E.g.:

        SQL> select * from PRODUCT_PRIVS;
        select * from PRODUCT_PRIVS
              *
        ERROR at line 1:
        ORA-00942: table or view does not exist


        SQL> select * from SYSTEM.PRODUCT_PRIVS;

        no rows selected

By chance, USER_PROFILE is not only a table, but also declared as a public
synonym:

SQL> select * from all_catalog where table_name = 'USER_PROFILE';

OWNER                          TABLE_NAME                     TABLE_TYPE
------------------------------ ------------------------------ -----------
SYSTEM                         USER_PROFILE                   TABLE
PUBLIC                         USER_PROFILE                   SYNONYM

Your own synonyms and public synonyms do not require an owner when you access
them (this probably was the main motivation for introducing synonyms).

-------------------------------------------------------------------------------

[Wednesday, April 19, 9:35]

> For question D, I am not certain of what you want for the answer.
>
> Are you looking for a projection on TABLE_NAME, and then set the
> OWNER = 'MYNAME' and the OBJECT_TYPE = 'INDEX'
> (like in the query below)
>
> SELECT TABLE_NAME
> FROM ALL_OBJECTS
> WHERE OWNER = 'YOURNAME'
> AND OBJECT_TYPE = 'INDEX'
>
> Or does an object type indicate ownership by a user, which in that case
> all we would need would be
>
> SELECT TABLE_NAME
> FROM ALL_OBJECTS
> WHERE OBJECT_TYPE = 'INDEX'

First, ALL_OBJECTS uses OBJECT_NAME for the name, not TABLE_NAME.

Otherwise the first query would be correct (with 'YOURNAME' replaced by your
Oracle login name or  by USER).
It gives you the names of the indexes which were created on your behalf
(e.g. for primary and unique keys in your tables).

However, if you are interested in objects owned by you, it is easier to
use the view USER_OBJECTS (which can be abbreviated to OBJ thanks to a public
synonym). It has the condition OWNER = USER builtin, so you do not have to
write it down explicitly.

I agree that the question is not very clear, it could also ask for on which
columns of which tables does an index exist. This is a bit more complicated,
and I didn't tell you about USER_INDEXES and USER_IND_COLUMNS which contain
this information. If you like, you can find the indexed columns, too
(maybe together with the index name and the table name). But this is not
required for this homework.